Interruption criteria for block decoding

ABSTRACT

While decoding a representation, imported from a channel, of a codeword that encodes K information bits as N>K codeword bits, by updating estimates of the codeword bits in a plurality of iterations, the iterations are interrupted upon satisfaction of an interruption criterion that is either an order-dependent interruption criterion or an interruption criterion that includes an estimate of mutual information of the codeword and a vector that is used in the decoding iterations. Either the iterations are terminated or the iterations are resumed after one or more elements of one or more vectors used in the iterations is/are modified.

This is a continuation-in-part of U.S. patent application Ser. No. 12/469,790, filed May 21, 2009, that claims the benefit of U.S. Provisional Patent Application No. 61/074,701, filed Jun. 23, 2008.

FIELD AND BACKGROUND OF THE INVENTION

Disclosed herein is a method of iterative block decoding, for example Low-Density Parity Check (LDPC) decoding, that uses innovative interruption criteria, and associated devices.

Error Correction Codes (ECCs) are commonly used in communication systems and in storage systems. Various physical phenomena occurring both in communication channels and in storage devices result in noise effects that corrupt the communicated or stored information. Error correction coding schemes can be used for protecting the communicated or stored information against the resulting errors. This is done by encoding the information before transmission through the communication channel or storage in the memory device. The encoding process transforms the information bit sequence into a codeword by adding redundancy to the information. This redundancy can then be used in order to recover the information from the possibly corrupted codeword through a decoding process.

In both communication systems and storage systems an information bit sequence i is encoded into a coded bit sequence v that is modulated or mapped into a sequence of symbols x that is adapted to the communication channel or to the memory device. At the output of the communication channel or memory device a sequence of symbols y is obtained. An ECC decoder of the system decodes the sequence y and recovers the bit sequence î, which should reconstruct the original information bit sequence i with high probability.

A common ECC family is the family of linear binary block codes. A length N linear binary block code of dimension K is a linear mapping of length K information bit sequences into length N codewords, where N>K. The rate of the code is defined as R=K/N. The encoding process of a codeword v of dimension 1×N is usually done by multiplying the information bit sequence i of dimension 1×K by a generator matrix G of dimension K×N according to

v=i·G  (1)

It is also customary to define a parity-check matrix H of dimension M×N, where M=N−K. The parity-check matrix is related to the generator matrix through the following equation:

GH ^(T)=0  (2)

The parity-check matrix can be used in order to check whether a length N binary vector is a valid codeword. A 1×N binary vector v belongs to the code if and only if the following equation holds:

H·v′= 0  (3)

(In equation (3), the prime on v′ means that v′ is a column vector.)
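Equations (1) through (3) can be illustrated with a small sketch. The code below is an illustration only: it assumes a systematic (7,4) Hamming code, which is not a code discussed in this document, and the function names `encode`, `syndrome` and `is_codeword` are hypothetical. All arithmetic is modulo 2.

```python
# Illustrative sketch of equations (1)-(3) over GF(2).
# Assumption: a systematic (7,4) Hamming code, chosen only because it is small.

def encode(info_bits, G):
    """Equation (1): v = i . G, with all sums taken modulo 2."""
    K, N = len(G), len(G[0])
    return [sum(info_bits[k] * G[k][j] for k in range(K)) % 2 for j in range(N)]

def syndrome(H, v):
    """Left-hand side of equation (3): H . v', with all sums taken modulo 2."""
    return [sum(h * b for h, b in zip(row, v)) % 2 for row in H]

def is_codeword(H, v):
    """v belongs to the code if and only if its syndrome is all-zero."""
    return not any(syndrome(H, v))

# G = [I | P] (K=4, N=7) and H = [P^T | I] (M=N-K=3), so that G.H^T = 0 (equation (2)).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

v = encode([1, 0, 1, 1], G)   # a valid codeword
corrupted = list(v)
corrupted[2] ^= 1             # flip one bit to model a channel error
```

In this sketch the syndrome of `corrupted` equals the column of H at the flipped position, which is why a non-zero syndrome signals that the vector is not a codeword.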

In recent years iterative coding schemes have become very popular. In these schemes the code is constructed as a concatenation of several simple constituent codes and is decoded using an iterative decoding algorithm by exchanging information between the constituent decoders of the simple codes. Usually, the code can be defined using a bipartite graph describing the interconnections between the constituent codes. In this case, decoding can be viewed as iterative message passing over the graph edges.

A popular class of iterative codes is Low-Density Parity-Check (LDPC) codes. An LDPC code is a linear binary block code defined by a sparse parity-check matrix H. As shown in FIG. 1, the code can be defined equivalently by a sparse bipartite graph G=(V,C,E) (also called a “Tanner graph”) with a set V of N bit nodes (N=13 in FIG. 1), a set C of M check nodes (M=10 in FIG. 1) and a set E of edges (E=38 in FIG. 1) connecting bit nodes to check nodes. The bit nodes correspond to the codeword bits and the check nodes correspond to parity-check constraints on the bits. A bit node is connected by edges to the check nodes in which the bit node participates. In the matrix representation of the code on the left side of FIG. 1 an edge connecting bit node i with check node j is depicted by a non-zero matrix element at the intersection of row j and column i.

Next to the first and last check nodes of FIG. 1 are shown the equivalent rows of equation (3). The symbol “⊕” means “XOR”.

LDPC codes can be decoded using iterative message passing decoding algorithms. These algorithms operate by exchanging messages between bit nodes and check nodes along the edges of the underlying bipartite graph that represents the code. The decoder is provided with initial estimates of the codeword bits (based on the communication channel output or based on the read memory content). These initial estimates are refined and improved by imposing the parity-check constraints that the bits should satisfy as a valid codeword (according to equation (3)). This is done by exchanging information between the bit nodes representing the codeword bits and the check nodes representing parity-check constraints on the codeword bits, using the messages that are passed along the graph edges.

In iterative decoding algorithms, it is common to utilize “soft” bit estimations, which convey both the bit estimations and the reliabilities of the bit estimations.

The bit estimations conveyed by the messages passed along the graph edges can be expressed in various forms. A common measure for expressing a “soft” bit estimation is as a Log-Likelihood Ratio (LLR)

$\log \frac{\Pr\left( v = 0 \mid \text{current constraints and observations} \right)}{\Pr\left( v = 1 \mid \text{current constraints and observations} \right)},$

where the “current constraints and observations” are the various parity-check constraints taken into account in computing the message at hand and the observations y corresponding to the bits participating in these parity checks. Without loss of generality, for simplicity we assume hereinafter that LLR messages are used throughout. The sign of the LLR provides the bit estimation (i.e., positive LLR corresponds to v=0 and negative LLR corresponds to v=1). The magnitude of the LLR provides the reliability of the estimation (i.e., |LLR|=0 means that the estimation is completely unreliable and |LLR|=∞ means that the estimation is completely reliable and the bit value is known).
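The sign and magnitude conventions of the LLR can be sketched as follows. The channel model is an assumption made only for this illustration: a binary symmetric channel with crossover probability p and equal a-priori bit probabilities, for which the LLR of a bit given its observation y is ±log((1−p)/p). The function names are hypothetical.

```python
import math

def llr_from_bsc(y, p):
    """LLR log(Pr(v=0|y)/Pr(v=1|y)) for one bit.

    Assumption (not from the text): binary symmetric channel with crossover
    probability 0 < p < 1 and equal a-priori probabilities for v=0 and v=1.
    """
    magnitude = math.log((1.0 - p) / p)
    return magnitude if y == 0 else -magnitude

def hard_decision(llr):
    """Sign of the LLR gives the bit estimate: positive -> 0, negative -> 1.
    An LLR of exactly 0 carries no information; returning 0 here is arbitrary."""
    return 0 if llr >= 0 else 1

def reliability(llr):
    """Magnitude of the LLR gives the reliability of the estimate."""
    return abs(llr)
```

Under this assumed model a less noisy channel (smaller p) yields a larger |LLR|, and p=1/2 yields an LLR of 0, i.e. a completely unreliable estimate.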

Usually, the messages passed during the decoding along the graph edges between bit nodes and check nodes are extrinsic. An extrinsic message m passed from a node n on an edge e takes into account all the values received on edges connected to n other than edge e (this is why the message is called extrinsic: it is based only on new information).

One example of a message passing decoding algorithm is the Belief-Propagation (BP) algorithm, which is considered to be the best algorithm from among this family of message passing algorithms.

Let

$P_{v} = \log \frac{\Pr\left( v = 0 \mid y \right)}{\Pr\left( v = 1 \mid y \right)}$

denote the initial decoder estimation for bit v, based only on the received or read symbol y. Note that it is also possible that some of the bits are not transmitted through the communication channel or stored in the memory device, hence there is no y observation for these bits. In this case, there are two possibilities: 1) shortened bits: the bits are known a-priori and P_(v)=±∞ (depending on whether the bit is 0 or 1). 2) punctured bits: the bits are unknown a-priori and

$P_{v} = \log \frac{\Pr\left( v = 0 \right)}{\Pr\left( v = 1 \right)},$

where Pr(v=0) and Pr(v=1) are the a-priori probabilities that the bit v is 0 or 1 respectively. Assuming the information bits have equal a-priori probabilities to be 0 or 1 and assuming the code is linear, then

$P_{v} = \log \frac{1/2}{1/2} = 0.$

Let

$Q_{v} = \log \frac{\Pr\left( v = 0 \mid \underline{y},\ H \cdot \underline{v} = 0 \right)}{\Pr\left( v = 1 \mid \underline{y},\ H \cdot \underline{v} = 0 \right)}$

denote the final decoder estimation for bit v, based on the entire received or read sequence y and assuming that bit v is part of a codeword (i.e., assuming H·v=0).

Let Q_(vc) denote a message from bit node v to check node c. Let R_(cv) denote a message from check node c to bit node v.

The BP algorithm utilizes the following update rules for computing the messages:

The bit node to check node computation rule is:

$Q_{vc} = P_{v} + \sum_{c' \in N(v,G) \backslash c} R_{c'v} \qquad (4)$

Here, N(n,G) denotes the set of neighbors of a node n in the graph G and c′∈N(v,G)\c refers to those neighbors excluding node ‘c’ (the summation is over all neighbors except c).

The check node to bit node computation rule is:

$R_{cv} = \phi^{-1}\left( \sum_{v' \in N(c,G) \backslash v} \phi\left( Q_{v'c} \right) \right) \qquad (5)$

Here,

${\phi (x)} = \left\{ {{{sign}(x)},{{- \log}\; \tan \; {h\left( \frac{x}{2} \right)}}} \right\}$

and operations in the φ domain are done over the group {0,1}×R⁺ (this basically means that the summation here is defined as summation over the magnitudes and XOR over the signs). Analogous to the notation of equation (4), N(c,G) denotes the set of bit node neighbors of a check node c in the graph G and v′∈N(c,G)\v refers to those neighbors excluding node ‘v’ (the summation is over all neighbors except v).
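The check node rule of equation (5) can be sketched as follows: signs are combined by XOR and magnitudes are summed in the φ domain, then mapped back. The sketch relies on the fact that the magnitude part of φ is its own inverse. The clamping constants and the function names are assumptions added here for numerical safety and are not part of the text.

```python
import math

# Assumed numerical guards: tanh(x/2) is 0 at x=0 and rounds to 1.0 for large x,
# either of which would break the logarithm in floating point.
_MIN_MAG = 1e-12
_MAX_MAG = 30.0

def phi_magnitude(x):
    """Magnitude part of phi: -log(tanh(x/2)) for x > 0 (self-inverse)."""
    x = min(max(x, _MIN_MAG), _MAX_MAG)
    return -math.log(math.tanh(x / 2.0))

def check_to_bit(q_others):
    """Equation (5): message R_cv computed from the Q_v'c of all neighbors v' != v."""
    sign_bit = 0        # element of {0,1}: XOR over the signs
    magnitude_sum = 0.0  # element of R+: sum over the phi-domain magnitudes
    for q in q_others:
        sign_bit ^= 1 if q < 0 else 0
        magnitude_sum += phi_magnitude(abs(q))
    magnitude = phi_magnitude(magnitude_sum)  # apply phi^{-1} to the magnitude
    return -magnitude if sign_bit else magnitude
```

For moderate inputs the result agrees with the equivalent product form 2·atanh(∏ tanh(Q/2)), which is a convenient cross-check when testing an implementation.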

The final decoder estimation for bit v is:

$Q_{v} = P_{v} + \sum_{c' \in N(v,G)} R_{c'v} \qquad (6)$

The order of passing messages during message passing decoding is called the decoding schedule. BP decoding does not imply utilizing a specific schedule; it only defines the computation rules (equations (4), (5) and (6)). The decoding schedule does not affect the expected error correction capability of the code. However, the decoding schedule can significantly influence the convergence rate of the decoder and the complexity of the decoder.

The standard message-passing schedule for decoding LDPC codes is the flooding schedule, in which in each iteration all the variable nodes, and subsequently all the check nodes, pass new messages to their neighbors (R. G. Gallager, Low-Density Parity-Check Codes, Cambridge, Mass.: MIT Press 1963). The standard BP algorithm based on the flooding schedule is given in FIG. 2.
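A flooding-schedule decoder built from equations (4), (5) and (6) might look like the sketch below. It is not a reproduction of FIG. 2; the function names, the dictionary-based message storage, the clamping constants and the stopping test on the syndrome are assumptions made for illustration.

```python
import math

def phi(x):
    """Magnitude part of phi, clamped (assumed guard) to stay finite."""
    x = min(max(x, 1e-12), 30.0)
    return -math.log(math.tanh(x / 2.0))

def check_update(qs):
    """Equation (5): sign product (XOR of sign bits) and phi-domain magnitude sum."""
    sign, total = 1.0, 0.0
    for q in qs:
        if q < 0:
            sign = -sign
        total += phi(abs(q))
    return sign * phi(total)

def flooding_bp(H, P, max_iters=20):
    """Return (hard decisions, iterations used, converged flag)."""
    M, N = len(H), len(H[0])
    edges = [(c, v) for c in range(M) for v in range(N) if H[c][v]]
    R = {e: 0.0 for e in edges}                # check-to-bit messages R_cv
    hard = [0 if p >= 0 else 1 for p in P]
    for it in range(1, max_iters + 1):
        # All bit nodes send first: equation (4), Q_vc = P_v + sum of R_c'v, c' != c.
        total = list(P)
        for (c, v) in edges:
            total[v] += R[(c, v)]
        Q_vc = {(c, v): total[v] - R[(c, v)] for (c, v) in edges}
        # Then all check nodes send: equation (5).
        for (c, v) in edges:
            others = [Q_vc[(c, u)] for u in range(N) if H[c][u] and u != v]
            R[(c, v)] = check_update(others)
        # A-posteriori estimates: equation (6).
        Q_v = list(P)
        for (c, v) in edges:
            Q_v[v] += R[(c, v)]
        hard = [0 if q >= 0 else 1 for q in Q_v]
        # Stop when all parity checks are satisfied (equation (3)).
        if all(sum(H[c][v] * hard[v] for v in range(N)) % 2 == 0 for c in range(M)):
            return hard, it, True
    return hard, max_iters, False

# Toy usage: assumed (7,4) Hamming parity-check matrix, all-zero codeword sent,
# one bit received with the wrong sign.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
P = [2.0, 2.0, -2.0, 2.0, 2.0, 2.0, 2.0]
decoded, iterations, converged = flooding_bp(H, P)
```

Note that this sketch keeps separate storage for P_v, Q_v, Q_vc and R_cv, which is the memory cost that the text attributes to the standard flooding implementation.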

The standard implementation of the BP algorithm based on the flooding schedule is expensive in terms of memory requirements. We need to store a total of 2|V|+2|E| messages (for storing the P_(v), Q_(v), Q_(vc), and R_(cv) messages). Moreover, the flooding schedule exhibits a low convergence rate and hence requires higher decoding logic (e.g., more processors on an ASIC) for providing a required error correction capability at a given decoding throughput.

More efficient, serial message passing decoding schedules are known. In a serial message passing schedule, the bit or check nodes are serially traversed and for each node, the corresponding messages are sent into and out from the node. For example, a serial schedule can be implemented by serially traversing the check nodes in the graph in some order and for each check node c∈C the following messages are sent:

1. Q_(vc) for each v∈N(c) (i.e., all Q_(vc) messages into the node c)

2. R_(cv) for each v∈N(c) (i.e., all R_(cv) messages from node c)

Serial schedules, in contrast to the flooding schedule, enable immediate and faster propagation of information on the graph, resulting in faster convergence (approximately two times faster). Moreover, a serial schedule can be efficiently implemented with a significant reduction of memory requirements. This can be achieved by using the Q_(v) messages and the R_(cv) messages in order to compute the Q_(vc) messages on the fly, thus avoiding the need to use an additional memory for storing the Q_(vc) messages. This is done by expressing Q_(vc) as (Q_(v)−R_(cv)) based on equations (4) and (6). Furthermore, the same memory as is initialized with the a-priori messages P_(v) is used for storing the iteratively updated Q_(v) a-posteriori messages. An additional reduction in memory requirements is obtained because in the serial schedule we only need to use the knowledge of N(c) ∀c∈C, while in the standard implementation of the flooding schedule we use both data structures N(c) ∀c∈C and N(v) ∀v∈V, requiring twice as much memory for storing the code's graph structure. The serially scheduled decoding algorithm appears in FIG. 3.
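The memory-saving idea described above, computing Q_vc on the fly as Q_v − R_cv and reusing one array for P_v and Q_v, can be sketched as follows. This is an illustrative sketch and not a reproduction of FIG. 3; the function names, the clamping constants and the end-of-iteration syndrome test are assumptions.

```python
import math

def phi(x):
    """Magnitude part of phi, clamped (assumed guard) to stay finite."""
    x = min(max(x, 1e-12), 30.0)
    return -math.log(math.tanh(x / 2.0))

def check_update(qs):
    """Equation (5): sign product and phi-domain magnitude sum."""
    sign, total = 1.0, 0.0
    for q in qs:
        if q < 0:
            sign = -sign
        total += phi(abs(q))
    return sign * phi(total)

def serial_bp(H, P, max_iters=20):
    """Check-node-serial schedule. Return (hard decisions, iterations, converged)."""
    M, N = len(H), len(H[0])
    # Only N(c) is stored; N(v) is never needed.
    neighbors = [[v for v in range(N) if H[c][v]] for c in range(M)]
    R = [[0.0] * len(nb) for nb in neighbors]  # |E| check-to-bit messages
    Q = list(P)                                # |V| values: P_v at start, then Q_v
    for it in range(1, max_iters + 1):
        for c, nb in enumerate(neighbors):
            # Messages into check c, computed on the fly: Q_vc = Q_v - R_cv.
            q_vc = [Q[v] - R[c][k] for k, v in enumerate(nb)]
            # Messages out of check c, immediately folded back into Q_v.
            for k, v in enumerate(nb):
                R[c][k] = check_update(q_vc[:k] + q_vc[k + 1:])
                Q[v] = q_vc[k] + R[c][k]
        hard = [0 if q >= 0 else 1 for q in Q]
        if all(sum(hard[v] for v in nb) % 2 == 0 for nb in neighbors):
            return hard, it, True
    return [0 if q >= 0 else 1 for q in Q], max_iters, False

# Toy usage: assumed (7,4) Hamming parity-check matrix, one bit with the wrong sign.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
P = [2.0, 2.0, -2.0, 2.0, 2.0, 2.0, 2.0]
decoded, iterations, converged = serial_bp(H, P)
```

Because each check node sees the Q_v values already updated by the check nodes processed before it in the same iteration, information propagates within an iteration rather than only between iterations.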

To summarize, serial decoding schedules have the following advantages over the flooding schedule:

1) Serial decoding schedules speed up the convergence by a factor of 2 compared to the standard flooding schedule. This means that we need only half the decoder logic in order to provide a given error correction capability at a given throughput, compared to a decoder based on the flooding schedule.

2) Serial decoding schedules provide a memory-efficient implementation of the decoder. A RAM for storing only |V|+|E| messages is needed (instead of for storing 2|V|+2|E| messages as in the standard flooding schedule). Half the ROM size for storing the code's graph structure is needed compared to the standard flooding schedule.

3) “On-the-fly” convergence testing can be implemented as part of the computations done during an iteration, allowing convergence detection during an iteration and decoding termination at any point. This can save on decoding time and energy consumption.
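As a worked illustration of the message counts, take the small graph of FIG. 1, with 13 bit nodes and 38 edges. These numbers are used only to make the arithmetic concrete; practical codes are far longer.

$2|V| + 2|E| = 2 \cdot 13 + 2 \cdot 38 = 102 \quad \text{(flooding schedule)}$

$|V| + |E| = 13 + 38 = 51 \quad \text{(serial schedule)}$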

Iterative coding systems exhibit an undesired effect called error floor as shown in FIG. 4, where, below a certain “noise” level in the communication channel or in the memory device, the Block Error Rate (BER) at the output of the decoder starts to decrease much more slowly even though the “noise” that is responsible for the bit errors becomes smaller. This effect is problematic, especially in storage systems, where the required decoder output block error rate should be very small (~10⁻¹⁰). Note that in FIG. 4 the noise increases to the right.

It is well known that the error correction capability and the error floor of an iterative coding system improve as the code length increases (this is true for any ECC system, but especially for iterative coding systems, in which the error correction capability is rather poor at short code lengths).

While properly designed LDPC codes are very powerful, and can correct a large number of errors in a code word, a phenomenon known as “trapping sets” may cause the decoder to fail, and increase the error floor of the code, even though the number of incorrect bits may be very small and may be confined to certain regions in the graph. Trapping sets are not well defined for general LDPC codes, but have been described as: “These are sets with a relatively small number of variable nodes such that the induced sub-graph has only a small number of odd degree check nodes.”

Trapping sets are related to the topology of the LDPC graph and to the specific decoding algorithm used, are hard to avoid and are hard to analyze.

Trapping sets are a problem in the field of storage since historically the reliability required from storage devices is relatively high, for example 1 bit error per 10¹⁴ stored bits. The result is that codes employed in memory devices such as flash memory devices should exhibit a low error floor, but trapping sets increase the error floor.

One known way of dealing with trapping sets is to decode a codeword in two phases. During the first phase, conventional iterative decoding is performed along the graph defined by the LDPC code. If a trapping set is suspected to exist, which prevents the decoding process from converging to a legal codeword (i.e., a codeword that satisfies all the parity equations), then the conventional iterative decoding is interrupted and the second phase of the decoding is entered. In the second phase, some of the values associated with the nodes of the graph of the code are modified. For example, the values of the check node messages R_(cv) may be reset to zero, or the magnitudes of the soft values Q_(v) corresponding to bit probabilities may be truncated to no more than a predetermined value, typically a value between 10 and 16.
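The two example modifications mentioned above can be sketched as follows. The function names and the container types (a dictionary keyed by edge for R_cv, a list for Q_v) are hypothetical; the sketch only shows the value manipulation, not when the second phase is entered.

```python
def reset_check_messages(R):
    """Second-phase option 1: reset every check node message R_cv to zero.
    R is assumed to be a dict keyed by (check, bit) edge."""
    return {edge: 0.0 for edge in R}

def truncate_soft_values(Q, limit):
    """Second-phase option 2: truncate |Q_v| to at most `limit`, keeping the sign.
    The text mentions a predetermined value typically between 10 and 16."""
    return [max(-limit, min(limit, q)) for q in Q]

# Toy usage with made-up values.
R = {(0, 0): 3.2, (0, 1): -7.5, (1, 1): 0.4}
Q = [20.0, -30.0, 5.0, -11.9]
R_reset = reset_check_messages(R)
Q_truncated = truncate_soft_values(Q, 12.0)
```

Truncation preserves the hard decision of every bit but lowers the confidence of strongly (and possibly wrongly) converged bits, so that later iterations are able to change them.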

In multi-phase decoding scenarios, if the decoder of one phase fails to converge, it would be useful to have a criterion for early interruption of the decoding, in order to transition to the next decoding phase, rather than waiting to perform some predefined large number of iterations before deciding to interrupt the decoding. Such a criterion would save time and power during the phase that fails by skipping most of the non-converging iterations.

DEFINITIONS

The methods described herein are applicable to correcting errors in data in at least two different circumstances. One circumstance is that in which data are retrieved from a storage medium. The other circumstance is that in which data are received from a transmission medium. Both a storage medium and a transmission medium are special cases of a “channel” that adds errors to the data. The concepts of “retrieving” and “receiving” data are generalized herein to the concept of “importing” data. Both “retrieving” data and “receiving” data are special cases of “importing” data from a channel.

Typical storage media to which the technology described below is applicable are nonvolatile memories such as flash memories.

The data that are decoded by the methods presented herein are a representation of a codeword. The data are only a “representation” of the codeword, and not the codeword itself, because the codeword might have been corrupted by noise in the channel before one of the methods is applied for decoding.

SUMMARY OF THE INVENTION

Because the existence of a trapping set implies that a small number of bits are failing to converge correctly, the likely existence of a trapping set may be identified, and the iterative decoding interrupted, if all but a small number of bits are stable during successive iterations of the decoding, or if a small number of parity check equations fail consistently while all other parity check equations are satisfied. These exemplary “interruption criteria”, that suggest the existence of a trapping set, are generalized herein to cases of slow convergence or non-convergence that are not necessarily caused by trapping sets.
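The first exemplary criterion, that all but a small number of bits are stable during successive iterations, could be monitored along the lines of the sketch below. The class name, the thresholds `max_unstable` and `patience`, and the exact triggering rule are hypothetical choices made for illustration; the text does not specify them.

```python
class StableBitsMonitor:
    """Flags a suspected trapping set when decoding has not converged and at most
    `max_unstable` hard decisions changed in each of `patience` consecutive
    iterations. Both thresholds are assumed values, not taken from the text."""

    def __init__(self, max_unstable=3, patience=4):
        self.max_unstable = max_unstable
        self.patience = patience
        self.previous = None   # hard decisions of the previous iteration
        self.streak = 0        # consecutive "almost stable" iterations

    def update(self, hard_bits, syndrome_weight):
        """Call once per iteration. Return True if decoding should be interrupted."""
        if syndrome_weight == 0:
            # All parity checks satisfied: normal convergence, not an interruption.
            self.streak = 0
            self.previous = list(hard_bits)
            return False
        if self.previous is not None:
            changed = sum(a != b for a, b in zip(self.previous, hard_bits))
            self.streak = self.streak + 1 if changed <= self.max_unstable else 0
        self.previous = list(hard_bits)
        return self.streak >= self.patience

# Toy usage: one bit keeps toggling while two checks stay unsatisfied.
monitor = StableBitsMonitor(max_unstable=1, patience=2)
decisions = [
    monitor.update([0, 1, 0, 0], 2),
    monitor.update([0, 1, 0, 1], 2),
    monitor.update([0, 1, 0, 0], 2),
]
```

A decoder loop would call `update` after each iteration and, on a True result, either terminate or enter a second decoding phase.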

Although there is an incentive to use the serial scheduler built-in mechanism for convergence detection via the number of satisfied check equations in order to terminate or interrupt the decoding process in case the decoder is not likely to converge, for example if a trapping set is formed during the decoding operation, the innovative interruption criteria presented below are applicable to message passing schedules generally, including flooding schedules, and not just to serial schedules. For example, decoding interruption criteria that are based on properties of an integral number of iterations can be applied to a flooding scheduler with the same efficiency as to a serial scheduler.

One embodiment provided herein is a method of decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the method including: (a) importing the representation of the codeword from a channel; (b) in a plurality of decoding iterations, updating estimates of the codeword bits; and (c) interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a decoder for decoding a representation of a codeword that encodes K information bits as N>K codeword bits, including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (a) in a plurality of decoding iterations, updating estimates of the codeword bits; and (b) interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a memory controller including: (a) an encoder for encoding K information bits as a codeword of N>K codeword bits; and (b) a decoder including a processor for executing an algorithm for decoding a representation of the codeword by steps including: (i) in a plurality of decoding iterations, updating estimates of the codeword bits, and (ii) interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a receiver including: (a) a demodulator for demodulating a message received from a communication channel, thereby producing a representation of a codeword that encodes K information bits as N>K codeword bits; and (b) a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (i) in a plurality of decoding iterations, updating estimates of the codeword bits, and (ii) interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a communication system for transmitting and receiving a message, including: (a) a transmitter including: (i) an encoder for encoding K information bits of the message as a codeword of N>K codeword bits, and (ii) a modulator for transmitting the codeword via a communication channel as a modulated signal; and (b) a receiver including: (i) a demodulator for receiving the modulated signal from the communication channel and for demodulating the modulated signal, thereby producing a representation of the codeword, and (ii) a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (A) in a plurality of decoding iterations, updating estimates of the codeword bits, and (B) interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a computer readable storage medium having computer readable code embodied on the computer readable storage medium, the computer readable code for decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the computer readable code including: (a) program code for, in a plurality of decoding iterations, updating estimates of the codeword bits; and (b) program code for interrupting the decoding iterations if an order-dependent interruption criterion is satisfied.

Another embodiment provided herein is a method of decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the method including: (a) importing the representation of the codeword from a channel; (b) in a plurality of decoding iterations, updating estimates of the codeword bits; and (c) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a decoder for decoding a representation of a codeword that encodes K information bits as N>K codeword bits, including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (a) in a plurality of decoding iterations, updating estimates of the codeword bits; and (b) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a memory controller including: (a) an encoder for encoding K information bits as a codeword of N>K codeword bits; and (b) a decoder including a processor for executing an algorithm for decoding a representation of the codeword by steps including: (i) in a plurality of decoding iterations, updating estimates of the codeword bits, and (ii) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a receiver including: (a) a demodulator for demodulating a message received from a communication channel, thereby producing a representation of a codeword that encodes K information bits as N>K codeword bits; and (b) a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (i) in a plurality of decoding iterations, updating estimates of the codeword bits, and (ii) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a communication system for transmitting and receiving a message, including: (a) a transmitter including: (i) an encoder for encoding K information bits of the message as a codeword of N>K codeword bits, and (ii) a modulator for transmitting the codeword via a communication channel as a modulated signal; and (b) a receiver including: (i) a demodulator for receiving the modulated signal from the communication channel and for demodulating the modulated signal, thereby producing a representation of the codeword, and (ii) a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: (A) in a plurality of decoding iterations, updating estimates of the codeword bits, and (B) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a computer readable storage medium having computer readable code embodied on the computer readable storage medium, the computer readable code for decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the computer readable code including: (a) program code for, in a plurality of decoding iterations, updating estimates of the codeword bits; and (b) program code for interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.

Another embodiment provided herein is a method of decoding a representation of a codeword that has been encoded according to a code, including: (a) importing the representation of the codeword from a channel; (b) providing an iterative procedure for decoding the representation of the codeword; and (c) following at most one iteration of the iterative procedure, deciding, according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code and a rate of the code, whether to modify the iterative procedure.

Another embodiment provided herein is a decoder for decoding a representation of a codeword that has been encoded according to a code, including a processor for executing an algorithm for deciding, following at most one iteration of an iterative procedure for decoding the representation of the codeword, and according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code and a rate of the code, whether to modify the iterative procedure.

Another embodiment provided herein is a memory controller, for controlling a memory, including: (a) an encoder for encoding a plurality of bits as a codeword according to a code; and (b) a decoder including a processor for executing an algorithm for deciding, following at most one iteration of an iterative procedure for decoding a representation of the codeword, and according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code, a rate of the code and a resolution with which the representation of the codeword is read from the memory, whether to modify the iterative procedure.

Another embodiment provided herein is a receiver including: (a) a demodulator for demodulating a message received from a communication channel, thereby producing a representation of a codeword that encodes a plurality of bits according to a code; and (b) a decoder including a processor for executing an algorithm for deciding, following at most one iteration of an iterative procedure for decoding the representation of the codeword, and according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code and a rate of the code, whether to modify the iterative procedure.

Another embodiment provided herein is a communication system for transmitting and receiving a message, including: (a) a transmitter including: (i) an encoder for encoding a plurality of bits as a codeword according to a code, and (ii) a modulator for transmitting the codeword via a communication channel as a modulated signal; and (b) a receiver including: (i) a demodulator for receiving the modulated signal from the communication channel and for demodulating the modulated signal, thereby producing a representation of the codeword, and (ii) a decoder including a processor for executing an algorithm for deciding, following at most one iteration of an iterative procedure for decoding the representation of the codeword, and according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code and a rate of the code, whether to modify the iterative procedure.

Another embodiment provided herein is a computer readable storage medium having computer readable code embodied on the computer readable storage medium, the computer readable code for decoding a representation of a codeword that encodes a plurality of bits according to a code, the computer readable code including program code for deciding, following at most one iteration of an iterative procedure for decoding the representation of the codeword, and according to a criterion that depends on at least one parameter selected from the group consisting of a syndrome of the representation of the codeword, a degree of the code and a rate of the code, whether to modify the iterative procedure.

Another embodiment provided herein is a method of decoding a representation of a codeword that has been encoded according to a code, including: (a) reading the representation of the codeword from a memory; (b) deciding, according to a criterion that depends on a degree of the code, whether and how to modify at least one parameter used in the reading; and (c) if the decision is to modify the at least one parameter: to modify the at least one parameter and to re-read the representation of the codeword using the at least one parameter as so modified.

Another embodiment provided herein is a memory controller for controlling a memory, including: (a) an encoder for encoding a plurality of bits as a codeword according to a code; (b) a decoder including a processor for executing an algorithm for deciding, according to a criterion that depends on a degree of the code, whether and how to modify at least one parameter that has been used to read a representation of the codeword from the memory; and (c) circuitry: (i) for reading the representation of the codeword from the memory, and (ii) if the decision is to modify the at least one parameter: (A) for modifying the at least one parameter in accordance with the decision, and (B) for re-reading the representation of the codeword from the memory using the at least one parameter as so modified.

Another embodiment provided herein is a memory device including: (a) amemory; and (b) a memory controller, for controlling the memory,including: (i) an encoder for encoding a plurality of bits as a codewordaccording to a code, (ii) a decoder including a processor for executingan algorithm for deciding, according to a criterion that depends on adegree of the code, whether and how to modify at least one parameterthat has been used to read a representation of the codeword from thememory; and (iii) circuitry: (A) for reading the representation of thecodeword from the memory, and (B) if the decision is to modify the atleast one parameter: (I) for modifying the at least one parameter inaccordance with the decision, and (II) for re-reading the representationof the codeword from the memory using the at least one parameter as somodified.

Another embodiment provided herein is a computer readable storage mediumhaving computer readable code embodied on the computer readable storagemedium, the computer readable code for decoding a representation, of acodeword that encodes a plurality of bits according to a code, that hasbeen read from a memory, the computer readable code including programcode for deciding, according to a criterion that depends on a degree ofthe code, whether and how to modify at least one parameter that has beenused to read the representation of the codeword from the memory.

Two general methods are provided herein for decoding a representation, that has been imported from a channel, of a codeword that encodes K information bits as N>K codeword bits. Most embodiments of the methods are applicable to block codes generally, not just to message-passing codes (codes, such as LDPC codes, that typically are decoded by using message passing decoding algorithms). In both general methods, estimates of the codeword bits are updated in a plurality of decoding iterations. The decoding iterations are interrupted if an interruption criterion is satisfied.

According to the first general method, the interruption criterion is order-dependent. As defined below, an order-dependent interruption criterion is a criterion that distinguishes among vectors, such as syndrome vectors, whose elements have the same values but in different orders. This is in contrast to “order-independent” criteria that are defined below as criteria that inspect properties, of vectors that are involved in the decoding, that are independent of the order in which specific values of the vector elements appear in the vectors.

In message passing decoding algorithms used to decode codes such as LDPC codes, the updating includes, in a graph that includes N bit nodes and N−K check nodes, exchanging messages between the bit nodes and the check nodes. Preferably, the interrupting includes modifying at least one element of at least one vector, such as the vector of Q_(v)'s, the vector of Q_(vc)'s, and/or the vector of R_(cv)'s, that is associated with the decoding, and then resuming the decoding iterations. Also preferably, the interruption criterion includes that a norm of absolute values of LLR estimates of selected codeword bits fails to increase within a predetermined number of iterations. Most preferably, the method also includes partitioning at least a portion of the graph into a plurality of sub-graphs. At least a portion of the message exchange is effected separately within each sub-graph, and the selected codeword bits are codeword bits within one of the sub-graphs.

Equivalently, in message passing decoding algorithms used to decode codes such as LDPC codes, the updating includes, in a parity check matrix that includes N−K rows and N columns, exchanging messages between the rows and the columns. Preferably, the interrupting includes modifying at least one element of at least one vector, such as the vector of Q_(v)'s, the vector of Q_(vc)'s, and/or the vector of R_(cv)'s, that is associated with the decoding, and then resuming the decoding iterations. Also preferably, the interruption criterion includes that a norm of absolute values of LLR estimates of selected codeword bits fails to increase within a predetermined number of iterations. Most preferably, the method also includes partitioning at least a portion of the parity check matrix into a plurality of sub-matrices. At least a portion of the message exchange is effected separately within each sub-matrix, and the selected codeword bits are codeword bits within one of the sub-matrices.

Alternatively, the interrupting consists of terminating the decoding iterations.

One preferred order-dependent interruption criterion includes the failure of a norm of absolute values of LLR estimates of selected estimated codeword bits to increase within a predetermined number of iterations. Most preferably, the selected estimated codeword bits are estimated codeword bits that contribute to non-zero elements of the syndrome, i.e., estimated codeword bits that correspond to 1's in rows of the parity check matrix that correspond to non-zero elements of the syndrome.

Another preferred order-dependent interruption criterion includes that the numbers of zero elements in runs of consecutive zero elements of the syndrome do not tend to increase, either within one decoding iteration or across the border between two decoding iterations.

Another preferred order-dependent interruption criterion includes that the largest number of consecutive zero elements of the syndrome does not increase from one decoding iteration to the next.

Another preferred order-dependent interruption criterion includes that the largest number of consecutive zero elements of the syndrome does not increase monotonically across a pre-determined number (at least three) of consecutive decoding iterations.
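The run-length criteria above may be illustrated by the following minimal Python sketch. It is not taken from the embodiments: the function names, the representation of the syndrome as a list of 0/1 integers, and the reading of "increase monotonically" as "strictly increase at every step" are all assumptions made for illustration.

```python
from typing import List, Sequence


def zero_run_lengths(syndrome: Sequence[int]) -> List[int]:
    """Lengths of the runs of consecutive zero elements, in order of appearance."""
    runs: List[int] = []
    current = 0
    for s in syndrome:
        if s == 0:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def longest_zero_run(syndrome: Sequence[int]) -> int:
    """Largest number of consecutive zero elements (0 if there are none)."""
    return max(zero_run_lengths(syndrome), default=0)


def should_interrupt(history: Sequence[Sequence[int]], window: int = 3) -> bool:
    """Hypothetical criterion: interrupt if the longest zero run did not
    strictly increase across the last `window` decoding iterations.
    `history` holds one syndrome per completed iteration, oldest first."""
    if window < 3:
        raise ValueError("window is assumed to be at least three")
    if len(history) < window:
        return False  # not enough iterations observed yet
    longest = [longest_zero_run(s) for s in history[-window:]]
    return not all(b > a for a, b in zip(longest, longest[1:]))
```

The criterion is order-dependent in the sense defined above: the syndromes `[0, 0, 0, 1, 1, 0]` and `[0, 1, 0, 0, 1, 0]` contain the same values but yield different longest runs.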

According to the second general method, the interruption criterion includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations.

In message passing decoding algorithms used to decode codes such as LDPC codes, the updating includes, in a graph that includes N bit nodes and N−K check nodes, exchanging messages between the bit nodes and the check nodes. Preferably, the interrupting includes modifying at least one element of at least one vector (possibly but not necessarily the vector used in the interruption criterion), such as the vector of Q_(v)'s, the vector of Q_(vc)'s, and/or the vector of R_(cv)'s, that is associated with the decoding, and then resuming the decoding iterations. Most preferably, the vector whose estimated mutual information with the codeword is included in the interruption criterion is a vector of LLR estimates Q of the codeword bits, and the estimate of mutual information is

${{\frac{1}{E}{\sum 1}} - {\log_{2}\left( {1 + ^{- {Q}}} \right)}},$

where E is the number of edges in the graph.
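As a hedged illustration, the estimate can be computed as in the short Python sketch below. The sketch assumes that the expression is read as the average, over E LLR values, of 1 − log₂(1 + e^(−|Q|)), and that one LLR value is available per edge; the function name and the list input are hypothetical.

```python
import math
from typing import Sequence


def mutual_information_estimate(llrs: Sequence[float]) -> float:
    """Average of 1 - log2(1 + exp(-|Q|)) over the supplied LLR values.

    Assumption: len(llrs) plays the role of E, the number of edges.
    The result lies between 0 (all LLRs zero, no information) and
    1 (all LLR magnitudes very large, full confidence)."""
    if not llrs:
        raise ValueError("at least one LLR value is required")
    total = sum(1.0 - math.log2(1.0 + math.exp(-abs(q))) for q in llrs)
    return total / len(llrs)
```

An interruption criterion built on this quantity could, for example, compare the estimate across iterations and interrupt when it stops growing; that comparison rule is an assumption and is not shown here.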

Equivalently, in message passing decoding algorithms used to decode codes such as LDPC codes, the updating includes, in a parity check matrix that includes N−K rows and N columns, exchanging messages between the rows and the columns. Preferably, the interrupting includes modifying at least one element of at least one vector (possibly but not necessarily the vector used in the interruption criterion), such as the vector of Q_(v)'s, the vector of Q_(vc)'s, and/or the vector of R_(cv)'s, that is associated with the decoding, and then resuming the decoding iterations.

Alternatively, the interrupting consists of terminating the decoding iterations.

A decoder corresponding to one of the two general methods includes one or more processors for decoding the representation of the codeword by executing an algorithm for updating the codeword bit estimates according to the corresponding general method.

A memory controller corresponding to one of the two general methods includes an encoder for encoding K information bits as a codeword of N>K bits and a decoder that corresponds to the general method. Normally, such a memory controller includes circuitry for storing at least a portion of the codeword in a memory and for retrieving a (possibly noisy) representation of the at least portion of the codeword from the memory. A memory device corresponding to one of the two general methods includes such a memory controller and also includes the memory.

A receiver corresponding to one of the two general methods includes a demodulator for demodulating a message received from a communication channel. The demodulator provides a representation of a codeword that encodes K information bits as N>K codeword bits. Such a receiver also includes a decoder that corresponds to the general method.

A communication system corresponding to one of the two general methods includes a transmitter and a receiver. The transmitter includes an encoder for encoding K information bits of a message as a codeword of N>K codeword bits and a modulator for transmitting the codeword via a communication channel as a modulated signal. The receiver is a receiver that corresponds to the general method.

A computer-readable storage medium corresponding to one of the two general methods has computer readable code embodied thereon for using the general method to decode a representation of a codeword that includes K information bits encoded as N>K codeword bits.

A third and a fourth general method are provided herein for decoding a representation of a codeword that has been encoded according to a code.

According to the third general method, the representation of the codeword is imported from a channel. An iterative procedure is provided for decoding the representation of the codeword. After only one iteration of the procedure, or even before effecting any iterations of the procedure, it is decided whether to modify the iterative procedure before continuing. The decision is made according to a criterion that depends on one or more parameters selected from among: the syndrome of the representation of the codeword (a vector parameter); a degree of the code (a scalar parameter); the code rate (a scalar parameter); and, if the channel is a storage medium, the resolution with which the representation of the codeword is read from the channel (in general a vector parameter). In the case of a code such as a LDPC code, the degree could be either the check node degree d_(c) or the variable node degree d_(v); in the preferred embodiments described below d_(c) is used. The check node degree of a code such as a LDPC code is defined below as the number of “1”s in each row of the parity check matrix (or the average number of “1”s if the rows do not all have the same number of “1”s).

Optionally, if the criterion depends on the syndrome, the criterion is order-independent.

Preferably, the criterion includes a bit error estimate. One preferred bit error estimate is

${q = {\frac{1}{2} - {\frac{1}{2}\left( {1 - \frac{2W}{M}} \right)^{1/\delta}}}},$

where M is the number of elements in the syndrome, W is the number of non-zero elements in the syndrome and δ is the relevant degree of the code (typically δ=d_(c) in the case of a code such as a LDPC code).
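A minimal Python sketch of this bit error estimate follows. The function name and argument names are hypothetical. The handling of the case 2W>M, where the base of the fractional power becomes negative, is an assumption of this sketch (the estimate is saturated at 1/2) and is not specified above.

```python
def bit_error_estimate(w: float, m: float, delta: float) -> float:
    """q = 1/2 - 1/2 * (1 - 2W/M)**(1/delta).

    w: number of non-zero syndrome elements (W)
    m: number of syndrome elements (M)
    delta: relevant degree of the code, e.g. the check node degree d_c
    """
    if m <= 0 or delta <= 0 or not 0 <= w <= m:
        raise ValueError("expected M > 0, delta > 0 and 0 <= W <= M")
    base = 1.0 - 2.0 * w / m
    if base <= 0.0:
        # Assumption: more than half of the checks fail, so saturate at 1/2.
        return 0.5
    return 0.5 - 0.5 * base ** (1.0 / delta)
```

The estimate depends on the syndrome only through its weight W, so a criterion built on it alone is order-independent. As a consistency check, if each of δ bits in a check is in error independently with probability q, the check fails with probability (1 − (1 − 2q)^δ)/2, and inserting that fraction for W/M returns q.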

A decoder corresponding to the third general method includes one or more processors for implementing the decision step of the third general method.

A memory controller corresponding to the third general method includes an encoder for encoding a plurality of bits as a codeword according to a code and a decoder that corresponds to the third general method. Normally, such a memory controller also includes circuitry for storing at least a portion of the codeword in a memory and for retrieving a (possibly noisy) representation of the codeword from the memory. A memory device corresponding to the third general method includes such a memory controller and also includes the memory.

A receiver corresponding to the third general method includes a demodulator for demodulating a message received from a communication channel. The demodulator provides a representation of a codeword that encodes a plurality of bits according to a code. Such a receiver also includes a decoder that includes one or more processors for implementing the decision step of the third general method relative to the representation of the codeword.

A communication system corresponding to the third general method includes a transmitter and a receiver. The transmitter includes an encoder for encoding a plurality of bits as a codeword according to a code and a modulator for transmitting the codeword via a communication channel as a modulated signal. The receiver is a receiver that corresponds to the third general method.

A computer-readable storage medium corresponding to the third general method has computer readable code embodied thereon for implementing the decision step of the third general method.

According to the fourth general method, a representation of the codeword is read from a memory. It is decided, according to a criterion that depends on the degree of the code, whether and how to modify at least one parameter used in the reading of the representation of the codeword from the memory. If the decision is to modify the parameter(s) then the parameter(s) are modified and the representation of the codeword is re-read from the memory using the modified parameter(s).

Preferably, the memory is a flash memory and the parameter(s) is/are respective values of one or more reference voltages and/or a number of reference voltages.

Preferably, the criterion also depends on the syndrome of the representation of the codeword, in which case the criterion optionally is order-independent.

Preferably, the criterion includes a bit error estimate.

A memory controller corresponding to the fourth general method includes an encoder for encoding a plurality of bits as a codeword according to a code, and a decoder that includes one or more processors for implementing the decision step of the fourth general method. The memory controller also includes circuitry for reading the representation of the codeword from the memory and circuitry for modifying the reading parameter(s) and for re-reading the representation of the codeword from the memory using the modified reading parameter(s) if the decision is to modify the reading parameter(s). A memory device corresponding to the fourth general method includes such a memory controller and also the memory that the memory controller controls.

A computer-readable storage medium corresponding to the fourth general method has computer readable code embodied thereon for implementing the decision step of the fourth general method.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments are herein described, by way of example only, with reference to the accompanying drawings, wherein:

FIG. 1 shows how a LDPC code can be represented as either a sparse parity check matrix or a sparse bipartite graph;

FIG. 2 shows a flooding schedule belief propagation algorithm;

FIG. 3 shows a conventional serial schedule belief propagation algorithm;

FIG. 4 illustrates error floor;

FIG. 5 shows how messages are exchanged within a sub-graph and between a sub-graph and a set of external check nodes;

FIG. 6 shows a belief propagation algorithm in which messages are exchanged within sub-graphs and between the sub-graphs and a set of external check nodes;

FIGS. 7A and 7B are high-level schematic block diagrams of decoders for implementing the algorithm of FIG. 6;

FIGS. 8 and 9 show two ways of partitioning the sparse bipartite graph of FIG. 1 into sub-graphs;

FIG. 10 is a high-level schematic block diagram of a flash memory device whose controller includes the decoder of FIG. 7A;

FIG. 11 is a detail of FIG. 10;

FIG. 12 is a high-level schematic block diagram of a communication system whose receiver includes the decoder of FIG. 7A;

FIG. 13 is an exemplary plot of counts of successive “0”s in a syndrome in which lengths of runs of such successive “0”s tend to increase;

FIG. 14 is an exemplary plot of counts of successive “0”s in a syndrome in which lengths of runs of such successive “0”s do not tend to increase;

FIGS. 15 and 16 show threshold voltage distributions and reference voltages for a three-bit-per-cell flash memory.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The principles and operation of low-complexity LDPC decoding and of LDPC decoding that overcomes persistent non-convergence such as non-convergence due to trapping sets may be better understood with reference to the drawings and the accompanying description.

In conventional decoders for LDPC codes, the memory required by the decoder is proportional to the code length N (equal to the number of variable nodes in the code's underlying graph |V|) and to the number of edges in the code's underlying graph |E|. In efficient implementations (e.g. based on serially scheduled decoders), the required memory can be as small as (|V|+|E|)*bpm bits, where |V| is the number of bit estimations, |E| is the number of edge messages and bpm is the number of bits per message stored in the memory of the decoder (note that we assume here that the same number of bits is required for storing a bit estimation and an edge message, for the sake of simplicity, though this is not necessarily the case). The decoder presented herein uses a much smaller memory for implementing the decoding, storing only a small fraction of the |V| bit estimations and of the |E| edge messages simultaneously, without any degradation in the decoder's error correction capability, compared to a conventional decoder, assuming sufficient decoding time is available. This is achieved by employing an appropriate decoding schedule and using the decoding hardware described herein.

The methods and decoders described herein operate by dividing the underlying graph representing the code into several sections and implementing the message passing decoding algorithm by sequentially processing the different sections of the graph, one or more sections at a time. At each stage during decoding only the bit estimations and edge messages corresponding to the graph section(s) that is/are currently being processed are stored. This way a very long LDPC code can be employed, providing near optimal error correction capability and a very low error floor, while utilizing low complexity decoding hardware.

The decoders presented herein are highly suitable for usage in memory devices, principally for the three following reasons:

-   1. A low ECC error floor is especially important in memory devices, which have severe decoder output BER requirements (<10⁻¹⁵). When short codes are used, achieving such a low error floor is very hard and usually requires sacrificing the error correction capability of the code, which is already compromised due to the short length of the code. Therefore, by using an equivalent long code, the error correction capability of the code is improved, and thus lower ECC redundancy is required for protecting information against a given memory “noise” which corrupts the stored data. This in turn results in better cost efficiency of the memory, because a larger amount of information can be stored in a given number of memory cells (or using a given memory silicon size). Hence, employing a long ECC in memory devices is expected to provide a significant advantage.
-   2. The LDPC methods presented herein allow for processing a section of the code's underlying graph at each processing phase, instead of the entire graph at once. This means that we can store only a part of the “soft” bit estimations at each phase and not all of the “soft” bit estimations at once. Here the term “soft” bit estimates refers to a collection of bits describing the reliability of an estimate ‘y’ for each stored bit deduced from reading from the storage (possibly a flash device).
    -   This feature can be easily utilized in a memory device, because only the presently required bit observations (y) can be read from the storage device, hence there is no need for a large buffer in the memory controller in order to implement the ECC decoding. Alternatively, even if all bit observations (represented by the vector y) are read from the memory at once, the buffer required for storing them is usually much smaller than the memory required for storing the soft bit estimates (the P_(v) messages) required by the decoder. This way, only part of the soft bit estimates corresponding to the graph section that is currently being processed by the decoder are generated each time, resulting in a smaller decoder memory requirement.
    -   Consider for example a SLC Flash memory device (a Flash memory device that stores one bit per cell; “SLC” means “Single Level Cell” and actually is a misnomer because each cell supports two levels; the “S” in “SLC” refers to there being only one programmed level), in which each cell stores a single bit v and the state y read from each cell can be either 0 or 1. Then the memory needed for storing the vector y of read cell states is N bits. On the other hand, the memory required for storing all the soft bit estimates (P_(v) messages) can be larger (for example 6N bits if each LLR estimate is stored in 6 bits). Hence, it is more efficient to generate only the required soft bit estimates in each decoder activation. A LLR bit estimate

$P_{v} = {\log \frac{\Pr \left( {v = \left. 0 \middle| y \right.} \right)}{\Pr \left( {v = \left. 1 \middle| y \right.} \right)}}$

for some bit v can be generated from the corresponding bit observations y that are read from the flash memory device based on a-priori knowledge of the memory “noise”. In other words, by knowing the memory “noise” statistics we can deduce the probability that a bit v that was stored in a certain memory cell is 0/1 given that ‘y’ is read from the cell.

-   -   For example, assume that in a certain SLC Flash memory device the probability of reading a cell state different from the one that was programmed is p=10⁻². Then, if y=0,

$P_{v} = {{\log \frac{1 - p}{p}} = 4.6}$

and if y=1 then

$P_{v} = {{\log \frac{p}{1 - p}} = {- {4.6.}}}$

Furthermore, if the number of states that can be read from each cell of the flash device (represented by ‘y’) is 8 because the cell stores a single bit (one “hard bit”) and the device is configured to read eight threshold voltage levels, equivalent to two “soft bits”, then each element ‘y’, which requires storage for 3 bits in the controller, is converted to an LLR value P_(v) that may be represented as more than 3 bits, for example as 6 bits (BPM=Bits Per Message=6). These 6 bits are a soft bit estimate, as opposed to the 2 soft bits that are read from the flash cell and that correspond to this 6-bit LLR value.

-   3. A decoding schedule of the type presented herein allows for a smaller memory requirement (compared with conventional decoding schedules). However, the decoding schedules presented herein might slow down the decoder convergence rate and increase the decoding time, especially when operating near the decoder's maximal error correction capability. Such a decoder is highly suitable for memory devices, which can tolerate variable ECC decoding latencies. For example, if the required decoding time for the ECC to converge to the correct stored codeword is long due to a high number of corrupted bits, then the memory controller can stop reading the memory until the decoding of the previously read codeword is finalized. Note that during most of a flash memory device's life, the memory “noise” is small and the number of corrupted bits is small. Hence, the decoder operates efficiently and quickly, allowing for efficient pipelined memory reading. Rarely, the number of corrupted bits read from the memory is high, requiring a longer decoding time and resulting in a reading pipeline stall. Therefore on average the throughput is left unharmed even with these variable decoding time characteristics.

According to one class of embodiments, the bipartite graph G=(V,C,E) that represents the code is divided into several sections in the following way. 1) Divide the set V of bit nodes into t disjoint subsets: V₁, V₂, . . . , V_(t) (such that V=V₁∪V₂∪ . . . ∪V_(t)). 2) For each subset V_(i) of bit nodes, form a subset C_(i) of check nodes, including all of the check nodes that are connected solely to the bit nodes in V_(i). 3) Form a subset C_(J) of external check nodes, including all of the check nodes that are not in any of the check node subsets formed so far, i.e. C_(J)=C\(C₁∪C₂∪ . . . ∪C_(t)). 4) Divide the graph G into t sub-graphs G₁, G₂, . . . , G_(t) such that G_(i)=(V_(i), C_(i), E_(i)) where E_(i) is the set of edges connected between bit nodes in V_(i) and check nodes in C_(i). Denote the edges connected to the set C_(J) by E_(J) (note that E_(J)=E\(E₁∪E₂∪ . . . ∪E_(t))).
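Steps 2) through 4) of this partitioning can be sketched in Python as follows, taking the bit node subsets of step 1) as given. The function name, the dictionary representation of the graph (each check node mapped to the set of bit nodes it connects to) and the toy graph used in the usage note are hypothetical and are not the graph of FIG. 1.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (bit node, check node)


def partition_graph(
    check_to_bits: Dict[int, Set[int]],
    bit_subsets: List[Set[int]],
) -> Tuple[List[Set[int]], Set[int], List[Set[Edge]], Set[Edge]]:
    """Return (C_1..C_t, C_J, E_1..E_t, E_J) for the given V_1..V_t.

    Assumes the bit subsets are disjoint and every check node has at
    least one neighbouring bit node."""
    check_subsets: List[Set[int]] = [set() for _ in bit_subsets]
    external_checks: Set[int] = set()
    for c, bits in check_to_bits.items():
        for i, v_i in enumerate(bit_subsets):
            if bits <= v_i:  # connected solely to bit nodes in V_i
                check_subsets[i].add(c)
                break
        else:
            external_checks.add(c)  # belongs to C_J
    edge_subsets = [
        {(v, c) for c in c_i for v in check_to_bits[c]} for c_i in check_subsets
    ]
    external_edges = {(v, c) for c in external_checks for v in check_to_bits[c]}
    return check_subsets, external_checks, edge_subsets, external_edges
```

For a toy graph with checks `{0: {0, 1}, 1: {2, 3}, 2: {1, 2}}` and bit subsets `[{0, 1}, {2, 3}]`, check 2 spans both subsets and therefore ends up in C_(J), together with its two edges in E_(J).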

In these embodiments, the graph G is processed according to a special message passing schedule, by iteratively performing decoding phases, and in each decoding phase exchanging messages along the graph edges in the following order:

-   for i=1 through t
    -   1. Send R_(cv) messages from check nodes c∈C_(J) to bit nodes v∈V_(i) along edges in E_(J), depicted as the R_(CJVi) messages in FIG. 5. Set R_(cv) messages from check nodes c∈C_(i) to bit nodes v∈V_(i) to zero, depicted by the Rc_(i)v_(i) messages in FIG. 5. Set initial bit estimations to P_(v) for every bit v∈V_(i), depicted as the P_(Vi) messages in FIG. 5. Note that the messages R_(CJVi) are the result of activating the decoder for the other t−1 sub-graphs G_(k), k≠i, prior to this step. In the event that other sub-graphs have not been processed yet, their corresponding messages Q_(vicJ) in FIG. 5 are set to P_(vi), i.e., the estimates read from the memory or received from the communication channel. In case those are punctured bits, their P_(vi)'s are zero.
    -   2. Perform one or more iterations by sending Q_(vc) messages from bit nodes in V_(i) to check nodes in C_(i), and R_(cv) messages from check nodes in C_(i) to bit nodes in V_(i), along the edges in E_(i), according to some schedule (e.g. according to the serial schedule described in FIG. 3, performed by serially traversing the check nodes in C_(i) and for each check node sending the messages to and from that check node). This is depicted as the Qv_(i)c_(i) and Rc_(i)v_(i) messages in FIG. 5.
    -   3. Send Q_(vc) messages from bit nodes in V_(i) to check nodes in C_(J) along the edges in E_(J), depicted as the Qv_(i)c_(J) messages in FIG. 5.

Decoding continues until the decoder converges to a valid codeword, satisfying all the parity-check constraints, or until a maximum number of allowed decoding phases is reached. The stopping criterion for the message passing within each sub-graph i is similar: iterate until either all the parity-check constraints within this sub-graph are satisfied or a maximum number of allowed iterations is reached. In general, the maximum allowed number of iterations may change from one sub-graph to another or from one activation of the decoder to another.

The messages sent along the edges in E_(J) (R_(CJVi) messages and Qv_(i)c_(J) messages in FIG. 5) are used for exchanging information between the different sections of the graph. The messages that are sent at each stage during decoding can be computed according to the standard computation rules of the message passing decoding algorithm. For example, if BP decoding is implemented then the messages are computed according to equations (4) and (5). Other message-passing decoding algorithms, such as Min Sum algorithms, Gallager A algorithms and Gallager B algorithms, have their own computation rules.

Such a decoding algorithm, assuming serially scheduled message passing decoding within each sub-graph, implementing BP decoding, is summarized in FIG. 6. In this algorithm, at each stage during decoding only the Q_(v) messages corresponding to bit nodes v∈V_(i), the R_(cv) messages corresponding to the edges in E_(i) and the messages corresponding to the edges in E_(J) are stored. Hence, the decoder of this class of embodiments requires storing only (max{|V₁|, |V₂|, . . . , |V_(t)|}+max{|E₁|, |E₂|, . . . , |E_(t)|}+|E_(J)|) messages simultaneously, compared to (|V|+|E|) messages in efficient conventional decoders. Thus the memory requirement is approximately a 1/t fraction of the memory required for a conventional decoder. When implementing long LDPC codes this provides a significant advantage in a decoder's complexity.

A high-level schematic block diagram of an exemplary decoder 30 according to this class of embodiments is shown in FIG. 7A. Decoder 30 includes:

-   1. An initial LLRs computation block 32 that computes the initial bit estimations P_(i)=[P_(v):v∈V_(i)] for bits v∈V_(i) in the currently processed sub-graph G_(i)=(V_(i), C_(i), E_(i)), based on the corresponding bit observations y_(i)=[y_(v):v∈V_(i)] read from the memory or received from the communication channel (where y_(v) is the observation corresponding to bit v).
-   2. A read/write memory 34 including a memory section 36 for storing the bit estimations for bit nodes v∈V_(i) in the currently processed sub-graph (Q_(i) messages which are initialized as the P_(v) messages).
-   3. A read/write memory 35 including:
    -   3a. A memory section 38 for storing the R_(cv) messages corresponding to the edge set E_(i) of the currently processed sub-graph.
    -   3b. A memory section 40 for storing the messages along the edges in E_(J). Memory section 40 stores: i) the Q_(vc) messages from bit nodes v∈V_(i′), ∀i′∈{1, . . . , t}\i, to check nodes c∈C_(J), where i is the index of the currently processed sub-graph; and ii) for bit nodes v∈V_(i), memory section 40 first stores the R_(cv) messages from check nodes c∈C_(J) and, after the sub-graph's processing, stores the Q_(vc) messages to check nodes c∈C_(J).
-   4. Processing units 42 for implementing the computations involved in updating the messages (as shown in FIG. 6).
-   5. A routing layer 44 that routes messages between memory 34 and processing units 42. For example, in some sub-classes of this class of embodiments, within the loop over sub-graphs G₁ through G_(t) in FIG. 6, routing layer 44 assigns each processor 42 its own check node of the current sub-graph G_(i) and the check node processing is done in parallel for all the check nodes of G_(i) (or for as many check nodes of G_(i) as there are processors 42).
-   6. A read-only memory (ROM) 46 for storing the code's graph structure. Memory addressing, and switching by routing layer 44, are based on entries in ROM 46.

Decoder 30 includes a plurality of processing units 42 so that the computations involved in updating the messages may be effected in parallel. An alternative embodiment with only one processing unit 42 would not include a routing layer 44.

As noted above, a serial passing schedule traverses serially either the check nodes or the bit nodes. Decoder 30 of FIG. 7A traverses the check nodes serially. FIG. 7B is a high-level schematic block diagram of a similar decoder 31 that traverses the bit nodes serially.

An example of the graph partitioning according to this class of embodiments is shown in FIG. 8. An LDPC code which is described by a regular bipartite graph with 18 bit nodes and 9 check nodes, such that every bit node is connected to two check nodes and every check node is connected to four bit nodes, is used in this example. This is a length 18, rate 1/2 LDPC code. The original graph is shown on the left side of FIG. 8. This also is the graph of FIG. 1. The graph after partitioning its bit nodes, check nodes and edges into subsets is shown on the right side of FIG. 8. Note that this is the same graph, only rearranged for the sake of clarity. For this code, a prior art efficient decoder would require storing 18+36=54 messages, while the corresponding decoder 30 requires storing only 6+8+12=26 messages, providing a 52% reduction in the decoder's memory complexity, while maintaining the same error correction capability.
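The message counts in this example can be checked with the short Python sketch below. The function names are hypothetical. The per-sub-graph sizes passed in the usage note (three sub-graphs with 6 bit nodes and 8 internal edges each, plus 12 external edges) are inferred from the 6+8+12 sum and from the node degrees stated above, not read from FIG. 8 itself.

```python
from typing import Sequence


def conventional_messages(num_bits: int, num_edges: int) -> int:
    """Messages stored by an efficient conventional decoder: |V| + |E|."""
    return num_bits + num_edges


def partitioned_messages(
    bit_subset_sizes: Sequence[int],
    edge_subset_sizes: Sequence[int],
    num_external_edges: int,
) -> int:
    """Messages stored at once when one sub-graph is processed at a time:
    max|V_i| + max|E_i| + |E_J|."""
    return max(bit_subset_sizes) + max(edge_subset_sizes) + num_external_edges


conventional = conventional_messages(18, 36)
partitioned = partitioned_messages([6, 6, 6], [8, 8, 8], 12)
reduction_percent = 100.0 * (1.0 - partitioned / conventional)
```

With these inputs the two counts are 54 and 26, and the reduction rounds to 52%.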

It is preferred that all the sub-graphs be topologically identical, as in the example of FIG. 8. In this context, “topological identity” means that all the sub-graphs have equal numbers of bit nodes and equal numbers of check nodes; that each bit node has a corresponding bit node in every other sub-graph in terms of connectivity to internal check nodes; and that each sub-graph check node has a corresponding check node in every other sub-graph in terms of connectivity to bit nodes. For example, in FIG. 8:

-   Bit nodes 1, 5, 11, 13, 16 and 17 correspond because bit nodes 1 and 5 are connected to both check nodes of sub-graph 1, bit nodes 11 and 16 are connected to both check nodes of sub-graph 2, bit nodes 13 and 17 are connected to both check nodes of sub-graph 3, and none of these bit nodes is connected to an external check node (a check node of set C_(J)).
-   The remaining bit nodes correspond because each of these bit nodes is connected to one check node of the same sub-graph.
-   All the check nodes of the sub-graphs correspond because each one of these check nodes is connected to the two bit nodes of its sub-graph that are connected only to sub-graph check nodes and to two other bits of its sub-graph that are also connected to external check nodes.

Note that the sub-graphs need not have identical connectivity to the external check nodes in order to be “topologically identical”. For example, the two bit nodes, 15 and 18, of sub-graph 3, that are connected to the same external check node 7, are also connected to the same check node 9 of sub-graph 3, but the two bit nodes, 4 and 12, of sub-graph 1, that are connected to the same external check node 2, are connected to different check nodes (3 and 8) of sub-graph 1.

If need be, however, any LDPC graph G can be partitioned into sub-graphs by a greedy algorithm. The first sub-graph is constructed by selecting an arbitrary set of bit nodes. The check nodes of the first sub-graph are the check nodes that connect only to those bit nodes. The second sub-graph is constructed by selecting an arbitrary set of bit nodes from among the remaining bit nodes. Preferably, of course, the number of bit nodes in the second sub-graph is the same as the number of bit nodes in the first sub-graph. Again, the check nodes of the second sub-graph are the check nodes that connect only to the bit nodes of the second sub-graph. This arbitrary selection of bit nodes is repeated as many times as desired. The last sub-graph then consists of the bit nodes that were not selected and the check nodes that connect only to those bit nodes. The remaining check nodes constitute C_(J).
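The greedy partitioning just described can be sketched as follows. This is only an illustrative sketch: the function name `greedy_partition`, the representation of the graph as a list of sets, and the choice of consecutive bit indices as the "arbitrary" selection are assumptions made for the example and are not part of the embodiments described above.

```python
def greedy_partition(check_to_bits, num_bits, bits_per_subgraph):
    """Greedy partition of a Tanner graph into sub-graphs plus external checks.

    check_to_bits[c] is the set of bit nodes connected to check node c
    (hypothetical representation chosen for this sketch).
    Returns (subgraphs, external_checks), where each sub-graph is a pair
    (bit_set, check_set) and external_checks plays the role of C_J.
    """
    subgraphs = []
    assigned_checks = set()
    # Assumption: the "arbitrary" selection is simply consecutive bit indices.
    for start in range(0, num_bits, bits_per_subgraph):
        bits = set(range(start, min(start + bits_per_subgraph, num_bits)))
        # A check node belongs to this sub-graph only if ALL of its
        # neighbours are bit nodes of this sub-graph.
        checks = {c for c, nbrs in enumerate(check_to_bits)
                  if nbrs and nbrs <= bits}
        assigned_checks |= checks
        subgraphs.append((bits, checks))
    # Whatever check nodes remain span several sub-graphs: they form C_J.
    external_checks = set(range(len(check_to_bits))) - assigned_checks
    return subgraphs, external_checks


# Toy graph (not the graph of FIG. 8): 6 bit nodes, 4 check nodes.
toy_checks = [{0, 1}, {1, 2}, {3, 4, 5}, {2, 3}]
toy_subgraphs, toy_external = greedy_partition(toy_checks, 6, 3)
```

In the toy graph, the check node connected to bit nodes 2 and 3 straddles both sub-graphs and therefore ends up in the external set.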

In the class of embodiments described above, the LDPC graph G is partitioned into t sub-graphs, each with its own bit nodes and check nodes, plus a separate subset C_(J) of only check nodes. In another class of embodiments, as illustrated in FIG. 9, G is partitioned into just t sub-graphs, each with its own bit nodes and check nodes. For example, using the greedy algorithm described above, the last sub-graph (G_(t)) includes the non-selected bit nodes, the check nodes that connect only to these bit nodes, and also all the remaining check nodes. This is equivalent to the set C_(J) of the first class of embodiments being connected to its own subset of bit nodes separate from the bit nodes of the sub-graphs. In this class of embodiments, the algorithm of FIG. 6 is modified by including only sub-graphs G₁ through G_(t-1) in the sub-graphs loop and ending each decoding phase by following the sub-graphs loop with a separate exchange of messages exclusively within G_(t). FIG. 9 shows the case of t=4. In one sub-class of these embodiments, some of the bits are punctured bits, and G_(t) is dedicated to these bits: all the bits of G_(t) are punctured bits, and all the punctured bits are bits of G_(t).

FIG. 10 is a high-level schematic block diagram of a flash memory device. A memory cell array 1 including a plurality of memory cells M arranged in a matrix is controlled by a column control circuit 2, a row control circuit 3, a c-source control circuit 4 and a c-p-well control circuit 5. Column control circuit 2 is connected to bit lines (BL) of memory cell array 1 for reading data stored in the memory cells (M), for determining a state of the memory cells (M) during a writing operation, and for controlling potential levels of the bit lines (BL) to promote the writing or to inhibit the writing. Row control circuit 3 is connected to word lines (WL) to select one of the word lines (WL), to apply read voltages, to apply writing voltages combined with the bit line potential levels controlled by column control circuit 2, and to apply an erase voltage coupled with a voltage of a p-type region on which the memory cells (M) are formed. C-source control circuit 4 controls a common source line connected to the memory cells (M). C-p-well control circuit 5 controls the c-p-well voltage.

The data stored in the memory cells (M) are read out by column control circuit 2 and are output to external I/O lines via an I/O line and a data input/output buffer 6. Program data to be stored in the memory cells are input to data input/output buffer 6 via the external I/O lines, and are transferred to column control circuit 2. The external I/O lines are connected to a controller 20.

Command data for controlling the flash memory device are input to a command interface connected to external control lines which are connected with controller 20. The command data inform the flash memory of what operation is requested. The input command is transferred to a state machine 8 that controls column control circuit 2, row control circuit 3, c-source control circuit 4, c-p-well control circuit 5 and data input/output buffer 6. State machine 8 can output status data of the flash memory such as READY/BUSY or PASS/FAIL.

Controller 20 is connected or connectable with a host system such as a personal computer, a digital camera or a personal digital assistant. It is the host which initiates commands, such as to store or read data to or from the memory array 1, and provides or receives such data, respectively. Controller 20 converts such commands into command signals that can be interpreted and executed by command circuits 7. Controller 20 also typically contains buffer memory for the user data being written to or read from the memory array. A typical memory device includes one integrated circuit chip 21 that includes controller 20, and one or more integrated circuit chips 22 that each contain a memory array and associated control, input/output and state machine circuits. The trend, of course, is to integrate the memory array and controller circuits of such a device together on one or more integrated circuit chips. The memory device may be embedded as part of the host system, or may be included in a memory card that is removably insertable into a mating socket of host systems. Such a card may include the entire memory device, or the controller and memory array, with associated peripheral circuits, may be provided in separate cards.

FIG. 11 is an enlarged view of part of FIG. 10, showing that controller 20 includes an encoder 52 for encoding user data received from the host as one or more codewords, circuitry 54 for instructing command circuits 7 to store the codewords (or only the non-punctured bits thereof, if any of the bits of the codewords are punctured bits) in memory cell array 1 and for instructing command circuits 7 to retrieve the stored codewords (or the stored portions thereof in the punctured bit case) from memory cell array 1, and decoder 30 for decoding the representation of the codewords as retrieved by circuitry 54. Alternatively, controller 20 could include decoder 31 instead of decoder 30.

Note that the location of the dashed vertical line in FIG. 10 that separates integrated circuit chips 21 and 22 is somewhat arbitrary. Command circuits 7 and state machine 8 could be fabricated together with controller 20 on integrated circuit chip 21, in which case the combination of controller 20, command circuits 7 and state machine 8 would be considered a memory controller for memory cell array 1 and the remaining control circuits 2, 3, 4, 5 and 6 on integrated circuit chip 22.

Although the methods and the decoders disclosed herein are intended primarily for use in data storage systems, these methods and decoders also are applicable to communications systems, particularly communications systems that rely on wave propagation through media that strongly attenuate high frequencies. Such communication is inherently slow and noisy. One example of such communication is radio wave communication between shore stations and submerged submarines.

FIG. 12 is a high-level schematic block diagram of a communication system 100 that includes a transmitter 110, a channel 103 and a receiver 112. Transmitter 110 includes an encoder 101 and a modulator 102. Receiver 112 includes a demodulator 104 and decoder 30. Encoder 101 receives a message and generates a corresponding codeword. Modulator 102 subjects the generated codeword to a digital modulation such as BPSK, QPSK or multi-valued QAM and transmits the resulting modulated signal to receiver 112 via channel 103. At receiver 112, demodulator 104 receives the modulated signal from channel 103 and subjects the received modulated signal to a digital demodulation such as BPSK, QPSK or multi-valued QAM. Decoder 30 decodes the resulting representation of the original codeword as described above. Alternatively, receiver 112 could include decoder 31 instead of decoder 30.

Returning now to the issue of slow convergence of a block decoder, for example, slow convergence of a LDPC decoder because of the presence of a trapping set, most conventional criteria for interrupting iterative block decoding (e.g., terminating the decoding, or else modifying some of the values associated with the nodes of a Tanner graph and then resuming the decoding) are what are termed herein “order-independent” interruption criteria. These criteria inspect properties, of vectors that are involved in the iterative decoding, that are independent of the order in which specific values of the vector elements appear in the vectors. For example, the most widely used criterion for testing the convergence of a block decoder is to count the number of elements of the syndrome H·v′, where v′ is the column vector of estimated bits, that are non-zero at the end of successive iterations. A similar criterion for interrupting a slowly converging LDPC decoder and resetting the R_(cv) messages to zero or truncating the soft values Q_(v) is that a predetermined number of elements (typically one element) of the syndrome are non-zero after a pre-determined number of iterations or after a pre-determined time or after a pre-determined number of message exchanges.
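As a minimal illustration of the conventional order-independent test, the syndrome H·v′ over GF(2) and its count of non-zero elements can be computed as sketched below. The function names and the small parity-check matrix are hypothetical and serve only the example.

```python
def syndrome(H, v):
    """Syndrome H*v' over GF(2); H is a list of rows of 0/1, v a list of 0/1."""
    return [sum(h * b for h, b in zip(row, v)) % 2 for row in H]


def syndrome_weight(s):
    """Number of non-zero syndrome elements (an order-independent quantity)."""
    return sum(1 for element in s if element)


# Hypothetical 3x6 parity-check matrix, used only for illustration.
H_toy = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
valid_word = [1, 0, 1, 1, 1, 0]   # satisfies all three checks
noisy_word = [0, 0, 1, 1, 1, 0]   # same word with bit 0 flipped
```

Flipping bit 0 violates exactly the two parity checks in which that bit participates, so the count of non-zero syndrome elements rises from zero to two.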

One interruption criterion previously suggested by the present inventors, specifically, one of the criteria discussed above that suggest the presence of a trapping set, is an example of what are termed herein “order-dependent” interruption criteria. This criterion is that parity check equations fail only within just one sub-graph of a graph that has been partitioned into sub-graphs as described above, while all the other parity check equations succeed. Such a restricted failure suggests that the sub-graph either is a trapping set or includes a trapping set. This criterion distinguishes among syndromes in which “0” and “1” syndrome elements appear in different orders. In general, “order-dependent” criteria distinguish among vectors in which vector elements having specific values appear in different orders. For example, an order-independent criterion that counts non-zero elements of a syndrome does not distinguish between the two syndromes

-   (000001000001)

and

-   (000000000011)

because both syndromes have two non-zero elements. An order-dependent criterion does distinguish between these two syndromes. The appended claims generalize such “order-dependent” criteria to order-dependent criteria generally for interrupting the iterative decoding of any representation of a block codeword. As noted above, the interruption of an iterative decoding of a representation of a block codeword can be either a termination of the decoding (giving up on a codeword representation that is too noisy to decode) or a modification of some of the vector elements involved in the decoding in preparation for continued iterative decoding.
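The difference between the two kinds of criteria can be illustrated with the two syndromes above. In the sketch below, the assignment of the twelve check nodes to four sub-graphs of three consecutive check nodes each is purely hypothetical (and ignores any external check nodes); it is assumed only so that the order-dependent test has something to inspect.

```python
def count_nonzero(syndrome_bits):
    """Order-independent: only the number of '1' elements matters."""
    return sum(1 for s in syndrome_bits if s)


def failing_subgraphs(syndrome_bits, check_to_subgraph):
    """Order-dependent: WHICH checks fail determines the result."""
    return {check_to_subgraph[c] for c, s in enumerate(syndrome_bits) if s}


def failures_confined_to_one_subgraph(syndrome_bits, check_to_subgraph):
    """True if parity checks fail in exactly one sub-graph."""
    return len(failing_subgraphs(syndrome_bits, check_to_subgraph)) == 1


syndrome_a = [int(ch) for ch in "000001000001"]
syndrome_b = [int(ch) for ch in "000000000011"]

# Hypothetical layout: checks 0-2 in sub-graph 0, 3-5 in sub-graph 1, etc.
layout = [c // 3 for c in range(12)]
```

Under this assumed layout the counting criterion returns 2 for both syndromes, whereas the confinement test is satisfied only by the second syndrome, whose two failing checks lie in the same sub-graph.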

FIGS. 13 and 14 illustrate an order-dependent interruption criterion that is useful for testing the convergence of block decoders generally, not just for testing for the presence of a LDPC trapping set. This criterion inspects the lengths of runs of consecutive zeros in the syndrome. In a decoding schedule that considers the parity check equations successively, for example a serial schedule LDPC decoding that traverses the check nodes, during a decoding iteration that improves the estimate of the codeword bits, it is expected that the lengths of the runs of consecutive “0”s in the syndrome should tend to increase monotonically, as illustrated in FIG. 13. FIG. 13 is an exemplary plot of counts of successive “0”s in a syndrome in which the lengths of runs of consecutive “0”s tend to increase as the decoding iteration in question progresses. The count is initialized to zero at the start of the iteration and increased by one whenever the calculations performed in connection with a parity check equation (for example, the message exchanges at a serial schedule LDPC check node) produce a “0” syndrome element. When a “1” syndrome element is produced, the count is re-set to zero. That the lengths of runs of successive “0”s do not tend to increase monotonically, as illustrated in FIG. 14, suggests a need to interrupt the decoding. The runs of successive “0” elements inspected by this criterion may be all within one decoding iteration or may span the boundary between two decoding iterations.
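The counting procedure just described may be sketched as follows. The function name `zero_run_lengths` is hypothetical, and the decision to record only runs of non-zero length (so that two adjacent “1” elements do not produce an empty run) is an assumption of this sketch.

```python
def zero_run_lengths(syndrome_bits):
    """Lengths of the runs of consecutive 0s, in the order they are produced.

    The counter starts at zero, is incremented for every 0 syndrome element
    and is reset to zero whenever a 1 syndrome element is produced.
    """
    runs = []
    count = 0
    for bit in syndrome_bits:
        if bit == 0:
            count += 1
        else:
            if count > 0:          # assumption: skip empty runs
                runs.append(count)
            count = 0
    if count > 0:                  # run still open when the stream ends
        runs.append(count)
    return runs


improving = [0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
improving_runs = zero_run_lengths(improving)
```

Because the function accepts any stream of syndrome elements, the same counter can be fed the elements of two successive iterations back to back when runs that span an iteration boundary are of interest.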

That the lengths of runs of consecutive “0”s “tend” to increase includes but is not limited to monotonic increase of the lengths of consecutive runs of “0”s. For example, in FIG. 13, the seventh run of consecutive “0”s is shorter than the sixth run of consecutive “0”s. One operational definition of a tendency to increase is that a running average of the lengths of consecutive runs of consecutive “0”s, computed within a user-defined sliding window over the consecutive runs of consecutive “0”s, increases monotonically. The selection of an appropriate window length is well within the skill of those ordinarily skilled in the art.
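One way to test this operational definition is sketched below. The function names are hypothetical, "increases monotonically" is interpreted here as strictly increasing, and the run lengths in the example are invented for illustration rather than read from FIG. 13.

```python
def running_averages(run_lengths, window):
    """Average run length inside each position of a sliding window."""
    return [sum(run_lengths[i:i + window]) / window
            for i in range(len(run_lengths) - window + 1)]


def tends_to_increase(run_lengths, window):
    """True if the windowed running average is strictly increasing.

    With fewer than window + 1 runs there is nothing to compare, and the
    function returns True (no evidence yet of a failure to increase).
    """
    averages = running_averages(run_lengths, window)
    return all(later > earlier
               for earlier, later in zip(averages, averages[1:]))


# Invented run lengths: the seventh run (8) is shorter than the sixth (9),
# yet the 3-run running average still increases at every step.
mostly_growing = [1, 2, 4, 5, 7, 9, 8, 12]
stalled = [5, 4, 6, 3, 5, 2, 4, 1]
```

A decoder using this test would interrupt the decoding when `tends_to_increase` returns `False`, which in the example happens for `stalled` but not for `mostly_growing`.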

Other useful order-dependent interruption criteria, that are evaluated at the ends of two or more successive iterations, include the largest number of consecutive “0”s in the syndrome not increasing in two successive decoding iterations and the largest number of “0”s in the syndrome not increasing monotonically in a predetermined number of decoding iterations.

Returning to the issue of trapping sets, another useful criterion for suggesting the presence of a trapping set in a LDPC graph is that a norm of the absolute values of the Q_(v)'s within one of the sub-graphs fails to increase in successive iterations. This norm could be any mathematical norm function known to those skilled in the art. Exemplary norms include a L1 norm

$\sum\limits_{v}{|Q_{v}|}$

and a L2 norm

$\sqrt{\sum\limits_{v}{|Q_{v}|^{2}}}$

where the sums are over the bit nodes of a sub-graph that is suspected to be or to include a trapping set.

A similar interruption criterion is based on the norm of absolute values of the Q_(v)'s whose bit nodes contribute, via the associated codeword bit estimates, to the non-zero syndrome elements. Failure of that norm to increase within a predetermined number of iterations, for example in successive iterations, suggests the presence of a trapping set, so that the decoding should be interrupted.

For example, a threshold value S of the number of non-zero elements of the syndrome (e.g., S=8 for a code with 900 check nodes) is predefined. An upper Q_(v) norm threshold (e.g. 8), a lower Q_(v) norm threshold (e.g. 5) and an iteration span t also are predefined. When the number of non-zero elements of the syndrome falls below the threshold S, the bit nodes that are directly connected, in the Tanner graph, to the check nodes with non-zero syndrome elements are inspected. The norm of the associated Q_(v)'s whose absolute values are less than the predefined lower Q_(v) norm threshold is computed. Failure of that norm to increase, or to increase above the predefined upper Q_(v) norm threshold, after another t iterations (t=1 is the “successive iteration” case) suggests the presence of a trapping set, so that the decoding should be interrupted.
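A sketch of this test is given below. It assumes, for illustration only, that the set of inspected bit nodes is fixed when the syndrome weight first falls below S and that the same bit nodes are re-examined t iterations later; all function and parameter names are hypothetical, and the two variants of the test (plain failure to increase, or failure to rise above the upper threshold) are selected by the optional `upper` argument.

```python
import math


def l1_norm(values):
    return sum(abs(q) for q in values)


def l2_norm(values):
    return math.sqrt(sum(q * q for q in values))


def weak_suspect_bits(syndrome_bits, check_to_bits, Q, lower):
    """Bit nodes adjacent to failing checks whose |Q_v| is below `lower`."""
    suspects = set()
    for c, s in enumerate(syndrome_bits):
        if s:
            suspects |= check_to_bits[c]
    return sorted(v for v in suspects if abs(Q[v]) < lower)


def trapping_set_suspected(Q_before, Q_after, bits, upper=None, norm=l1_norm):
    """Compare the norm over `bits` before and after t further iterations."""
    before = norm(Q_before[v] for v in bits)
    after = norm(Q_after[v] for v in bits)
    if upper is None:
        return after <= before      # variant 1: norm failed to increase
    return after <= upper           # variant 2: norm failed to exceed `upper`


# Toy data: 3 checks, 5 bit nodes, only check 1 fails.
toy_check_to_bits = [{0, 1}, {1, 2, 3}, {3, 4}]
Q_start = [9.0, -2.0, 1.5, -7.0, 6.0]
Q_stuck = [9.5, -1.8, 1.6, -7.5, 6.5]
Q_recovering = [10.0, 6.0, 7.0, -9.0, 8.0]
weak = weak_suspect_bits([0, 1, 0], toy_check_to_bits, Q_start, lower=5)
```

In the toy data, bit nodes 1 and 2 are the weak suspects; their norm does not grow in `Q_stuck`, which would trigger an interruption, but does grow in `Q_recovering`.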

Another useful interruption criterion is based on the mutual information between the codeword and a vector that is used in the decoding iterations. The mutual information between two random variables is a quantity that measures the dependence between the two variables. The most common unit for measurement of information is the bit, when base-2 logarithms are used. Formally, the mutual information between two discrete random variables X and Y is defined as

${I\left( {X;Y} \right)} = {\sum\limits_{y \in Y}{\sum\limits_{x \in X}{{p\left( {x,y} \right)}\log \frac{p\left( {x,y} \right)}{{p_{1}(x)}{p_{2}(y)}}}}}$

where p(x,y) is the joint probability distribution function of X and Y, p₁(x) is the marginal probability distribution function of X, and p₂(y) is the marginal probability distribution function of Y. It can be shown that, if the definition of codeword bit values is modified by mapping v=0 into {circumflex over (v)}=1 and v=1 into {circumflex over (v)}=−1, then in a decoding procedure that exchanges messages along the edges of a Tanner graph, the mutual information between a codeword and its LLR estimate Q is

$I = \frac{1}{E}\sum\limits_{v}\left\lbrack 1 - \log_{2}\left( 1 + e^{- \hat{v} \cdot Q_{v}} \right) \right\rbrack$

where E is the number of edges in the Tanner graph and the sum is over the bit nodes. I is expected to increase steadily from iteration to iteration in a successful decoding. That I fails to increase from iteration to iteration suggests that the decoding should be interrupted.

This expression for I is not useful as such for testing convergence of LDPC decoding because the codeword v is unknown and is indeed what is sought by the decoding; but if most of the Q_(v)'s are correct then this expression can be approximated as

$I \approx \frac{1}{E}\sum\limits_{v}\left\lbrack 1 - \log_{2}\left( 1 + e^{- |Q_{v}|} \right) \right\rbrack$
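The approximation, in which each term 1 − log₂(1 + e^(−|Q_v|)) is summed over the bit nodes and the sum is divided by E, can be evaluated directly from the current LLR estimates, as sketched below. The function names are hypothetical; the normalization by the number of edges E follows the text above, and the rule "interrupt if the estimate did not increase since the previous iteration" is one simple reading of the criterion.

```python
import math


def mutual_information_estimate(Q, num_edges):
    """Approximate I from the LLR estimates Q_v, without knowing the codeword.

    Each term is 1 - log2(1 + exp(-|Q_v|)); it is 0 for Q_v = 0 (no
    information about the bit) and approaches 1 as |Q_v| grows.
    """
    total = sum(1.0 - math.log2(1.0 + math.exp(-abs(q))) for q in Q)
    return total / num_edges


def mi_failed_to_increase(history):
    """True if the latest estimate is not larger than the previous one."""
    return len(history) >= 2 and history[-1] <= history[-2]


# Hypothetical LLR vectors from three successive iterations (E = 4 here).
iterations = [[0.5, -0.4, 0.3, 0.2], [1.5, -1.2, 0.9, 0.8], [1.4, -1.1, 0.9, 0.7]]
mi_history = [mutual_information_estimate(Q, num_edges=4) for Q in iterations]
```

Since exp(−|Q_v|) never exceeds 1, the expression is numerically safe for LLRs of any magnitude.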

Decoders 30 and 31 of FIGS. 7A and 7B respectively are modified easily to account for non-convergence and for slow convergence as described above. Specifically, routing layer 44 is modified to detect non-convergence or slow convergence according to the criteria described above, read/write memory 35 is modified to zero out some or all of the R_(cv) values, and/or processors 42 are modified to truncate some or all of the Q_(v) values, in response to non-convergence or slow convergence as determined by routing layer 44.

Returning to the issue of using order-independent criteria to decide whether to interrupt block decoding, the conventional order-independent criteria all evaluate the behavior of the iterative decoding across at least two iterations. We have discovered order-independent criteria that can be applied after only one decoding iteration, or even before commencing any decoding iterations, to decide whether to modify (not just interrupt) the block decoding.

One such criterion is based on an estimate of the bit error rate. For a LDPC code of check node degree d_(c) (i.e., d_(c) is the number of “1”s in each row of the parity-check matrix if the LDPC code is right regular, or the average number of “1”s per row if the LDPC code is not right regular, so that the rows do not all have the same number of “1”s), it can be shown that a good estimate of the bit error rate of a given representation of a codeword is

$q = {\frac{1}{2} - {\frac{1}{2}\left( {1 - \frac{2W}{M}} \right)^{1/d_{c}}}}$

where M is the number of elements in the syndrome and W is the number of non-zero elements (i.e., the number of “1”s) in the syndrome. W also is called the “syndrome weight”. Solving for W/M in terms of q gives

$\frac{W}{M} = {\frac{1}{2} - {\frac{1}{2}\left( {1 - {2q}} \right)^{d_{c}}}}$

For example, if the representation of the codeword has been read from a memory with a bit error rate of q=0.5% and the error correction code is a regular LDPC code with d_(c)=30, then W/M is expected to be at most about 0.13. If, before decoding or after one decoding iteration, W/M is greater than e.g. 0.15, the decoding should at least be interrupted, if not actually modified e.g. by substituting a powerful but slow decoding algorithm for a weak but fast decoding algorithm.
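The two formulas above, and the numerical example of q=0.5% with d_(c)=30, can be checked with a short sketch. The function names are hypothetical, and clamping the estimate at q=0.5 when more than half of the syndrome elements are non-zero is a choice made for this sketch, since the formula has no real-valued solution in that case.

```python
def estimate_ber(W, M, d_c):
    """Bit error rate estimate q from syndrome weight W of an M-element syndrome."""
    x = 1.0 - 2.0 * W / M
    if x <= 0.0:
        return 0.5      # assumption: treat W/M >= 1/2 as "completely unreliable"
    return 0.5 - 0.5 * x ** (1.0 / d_c)


def expected_syndrome_fraction(q, d_c):
    """Expected W/M for bit error rate q and check node degree d_c."""
    return 0.5 - 0.5 * (1.0 - 2.0 * q) ** d_c


def syndrome_weight_too_high(W, M, limit):
    """Order-independent test usable before any decoding iteration."""
    return W / M > limit


fraction = expected_syndrome_fraction(0.005, 30)   # approximately 0.13
```

For instance, with M=900 syndrome elements, a syndrome weight of W=150 gives W/M≈0.167, which exceeds the exemplary limit of 0.15 and would therefore trigger the interruption or modification described above.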

In the case of a representation of a codeword that is obtained by reading a flash memory, if W/M is greater than expected before decoding or after a pre-determined number of decoding iterations, another option is to change how the representation of the codeword is read. Command circuits 7 and state machine 8 (FIG. 10) read a cell of memory cell array 1 by comparing the cell's threshold voltage to one or more reference voltages. FIG. 15 shows, conceptually, the nominal threshold voltage distributions of a typical flash memory that stores three bits per cell. The threshold voltages of cells that are programmed to store the bit pattern “111” are distributed statistically as shown by curve 210. The threshold voltages of cells that are programmed to store the bit pattern “110” are distributed statistically as shown by curve 212. The threshold voltages of cells that are programmed to store the bit pattern “101” are distributed statistically as shown by curve 214. The threshold voltages of cells that are programmed to store the bit pattern “100” are distributed statistically as shown by curve 216. The threshold voltages of cells that are programmed to store the bit pattern “011” are distributed statistically as shown by curve 218. The threshold voltages of cells that are programmed to store the bit pattern “010” are distributed statistically as shown by curve 220. The threshold voltages of cells that are programmed to store the bit pattern “001” are distributed statistically as shown by curve 222. The threshold voltages of cells that are programmed to store the bit pattern “000” are distributed statistically as shown by curve 224.

To read a cell that has been programmed according to the threshold voltage distributions shown in FIG. 15, command circuits 7 and state machine 8 (FIG. 10) compare the cell's threshold voltage to the reference voltages V1, V2, V3, V4, V5, V6 and V7 that appear along the horizontal axis of FIG. 15 and that demark the boundaries between the various nominal threshold voltage distributions. Although the cells of memory array 1 (in FIG. 10) are programmed according to the threshold voltage ranges shown in FIG. 15, the actual threshold voltages of the cells may drift downward or upward over time and this drift may contribute to a greater than expected value of W/M. One approach to solving this problem is to move the reference voltages downward or upward, correspondingly, to compensate for the drift. Finding the values of the reference voltages that minimize W/M is a conventional multidimensional optimization problem that typically requires re-reading the cells that store the codeword in question using different trial values of the reference voltages. Controller 20 (FIG. 10) optimizes the reference voltages accordingly and instructs command circuits 7 to use the revised reference voltages to read the relevant cells of memory array 1.
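A deliberately simplified sketch of such a search is given below. It reduces the multidimensional problem to a single common shift applied to all seven reference voltages and tries a short list of candidate shifts; a real controller might optimize each reference voltage separately. The callable `read_syndrome_fraction`, which stands for "re-read the cells with these reference voltages and return W/M", is hypothetical, as are all other names and the numerical values.

```python
def best_reference_shift(read_syndrome_fraction, nominal_refs, candidate_shifts):
    """Try each common shift of the reference voltages and keep the best one.

    read_syndrome_fraction(refs) is a hypothetical callback that re-reads the
    codeword with the trial reference voltages and returns W/M.
    Returns (best_shift, best_fraction).
    """
    best_shift, best_fraction = None, None
    for shift in candidate_shifts:
        trial_refs = [v + shift for v in nominal_refs]
        fraction = read_syndrome_fraction(trial_refs)
        if best_fraction is None or fraction < best_fraction:
            best_shift, best_fraction = shift, fraction
    return best_shift, best_fraction


# Stand-in for the flash read: W/M is smallest when V1 sits at 0.8,
# i.e. when the cells have drifted 0.2 below the nominal V1 = 1.0.
def simulated_read(refs):
    return 0.05 + abs(refs[0] - 0.8)


nominal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
shift, fraction_at_best = best_reference_shift(
    simulated_read, nominal, [-0.3, -0.2, -0.1, 0.0, 0.1])
```

Each candidate costs one additional read of the relevant cells, so the length of the candidate list trades read latency against the accuracy of the compensation.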

Alternatively, it may be beneficial to read the threshold voltages of the relevant cells at a finer resolution, as illustrated in FIG. 16 that shows the reference voltages of FIG. 15 supplemented by intermediate reference voltages V0.5, V1.5, V2.5, V3.5, V4.5, V5.5, V6.5 and V7.5.

The discussion above of W/M assumes that all of the stored bits are equally reliable (bit error rate q). W/M also can be estimated for the case of different stored bits having different reliabilities. For example, in the bit pattern assignments of FIG. 15, the least significant bit changes between adjacent threshold voltage distribution curves, the bit of intermediate significance changes between pairs of adjacent threshold voltage distribution curves, and the most significant bit changes only between threshold voltage distribution curves 216 and 218. A downward drift over time of the threshold voltage of any cell other than a cell that stores the bit pattern “111” may lead to the least significant bit stored in that cell being read erroneously. A downward drift over time of the threshold voltage of a cell that stores one of the bit patterns “101”, “011” or “001” is more likely to lead to the bit of intermediate significance that is stored in that cell being read erroneously than is a downward drift over time of the threshold voltage of a cell that stores one of the other bit patterns. A downward drift over time of the threshold voltage of a cell that stores the bit pattern “011” is more likely to lead to the most significant bit stored in that cell being read erroneously than is a downward drift over time of the threshold voltage of a cell that stores any other bit pattern. Therefore, in a flash memory that uses the bit pattern assignment of FIG. 15, the most significant bits are more reliable than the bits of intermediate significance and the bits of intermediate significance are more reliable than the least significant bits. If the bit error rate of the most significant bits is q, the bit error rate of the bits of intermediate significance is 2q and the bit error rate of the least significant bits is 4q, it can be shown that

$\frac{W}{M} = {\frac{1}{2} - {\frac{1}{2}\left\lbrack {\left( {1 - {2q}} \right)\left( {1 - {4q}} \right)\left( {1 - {8q}} \right)} \right\rbrack^{d_{c}/3}}}$
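The unequal-reliability case can be explored numerically with the sketch below. It relies on the standard result that a parity check over bits with independent error probabilities p_i fails with probability 1/2 − 1/2·Π(1 − 2p_i), and it assumes, as the discussion above does, that each check draws d_(c)/3 bits from each of the three reliability classes (so d_(c) is taken to be divisible by 3). The function names are hypothetical.

```python
def check_failure_probability(error_rates):
    """Probability that a parity check over independent bits is unsatisfied."""
    product = 1.0
    for p in error_rates:
        product *= 1.0 - 2.0 * p
    return 0.5 - 0.5 * product


def expected_fraction_three_classes(q, d_c):
    """Expected W/M when each check has d_c/3 bits at rates q, 2q and 4q."""
    per_class = d_c // 3   # assumption: d_c is divisible by 3
    rates = [q] * per_class + [2 * q] * per_class + [4 * q] * per_class
    return check_failure_probability(rates)


q_msb = 0.001
mixed = expected_fraction_three_classes(q_msb, 30)
uniform = check_failure_probability([q_msb] * 30)
```

With q=0.1% for the most significant bits and d_(c)=30, the less reliable bits raise the expected W/M well above the value obtained when all thirty bits share the error rate q.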

The bit error rate q can be estimated for a specific code and flash technology via empirical measurements. For example, in each of a series of off-line simulations or measurements for a specific flash device, a codeword is stored in the flash device and is read from the flash device, the bit error rate (BER) or cell error rate (CER) is computed, and the syndrome weight W is computed. For each syndrome weight value the weighted average of the BER (or CER) is computed and tabulated. This table is then employed during the lifetime of the device by its controller in order to estimate the BER (or CER) from the syndrome weight of each codeword representation that is read from the device. Such an empirical model is preferred over the above formula because the above formula assumes, inaccurately, that the probabilities of error for each of the d_(c) bits in each check are independent. In a real life scenario this is not the case since, for example, two bits may arrive from the same flash cell, or one cell may induce cross coupling noise on other cells and thus even bits from different cells may have a dependency between their error probabilities if they both are affected by the same disturbing cell. Another case in which an empirical method for estimating the BER (or CER) might be preferred over the formulation presented above is when the bit error rate is different for each page of a multi-page flash memory. Furthermore, assuming that the ratio between the bit error rates of different pages is completely known (as in the above formula for 3 bits per cell) is valid only if the noise model is well defined, for example if the noise model is a Gaussian model. Unfortunately, in a real life flash device this often is not the case. Furthermore, in decoding a codeword read from a wordline of a multi-page flash memory, the number of bits from the lower page of one check can be different than the number of bits from the lower page of another check, which makes our assumption that the bits participating in each check are uniformly divided between the pages of the wordline inaccurate.

The empirical simulations or measurements yield a table of W vs. q. When a codeword representation is read from the flash device, W is computed and the corresponding q is looked up in the table. If, for a specific code rate and reading resolution, q is too large, the memory controller decides whether it is worthwhile to re-read the codeword representation at a higher resolution, and if so, at what resolution, or, alternatively, whether it is necessary to change the reference voltage values that are used to read the flash memory.
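A table lookup of this kind might be sketched as follows. The table entries are invented placeholders rather than measured data, the linear interpolation between tabulated syndrome weights is an assumption of the sketch, and the function names and the threshold in `needs_reread` are hypothetical.

```python
import bisect


def lookup_ber(table, W):
    """Estimate q for syndrome weight W from a sorted list of (W, q) pairs.

    Values between tabulated weights are linearly interpolated (an assumption
    of this sketch); values outside the table are clamped to its ends.
    """
    weights = [w for w, _ in table]
    i = bisect.bisect_left(weights, W)
    if i == 0:
        return table[0][1]
    if i == len(table):
        return table[-1][1]
    (w0, q0), (w1, q1) = table[i - 1], table[i]
    return q0 + (q1 - q0) * (W - w0) / (w1 - w0)


def needs_reread(table, W, q_limit):
    """True if the estimated q exceeds what the current read can support."""
    return lookup_ber(table, W) > q_limit


# Placeholder table; real entries would come from off-line measurements.
placeholder_table = [(0, 0.0), (50, 0.002), (100, 0.005)]
```

The controller would compare the looked-up q against a limit chosen for the code rate and reading resolution in use, and re-read at a higher resolution or adjust the reference voltages when the limit is exceeded.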

The foregoing has described a limited number of embodiments of methods for decoding a representation of a codeword, of decoders that use these methods, of memories whose controllers include such decoders, and of communication systems whose receivers include such decoders. It will be appreciated that many variations, modifications and other applications of the methods, decoders, memories and systems may be made.

1.-22. (canceled)
23. A method of decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the method comprising: (a) importing the representation of the codeword from a channel; (b) in a plurality of decoding iterations, updating estimates of the codeword bits; and (c) interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.
24. The method of claim 23, wherein the updating includes, in a graph that includes N bit nodes and N−K check nodes, exchanging messages between the bit nodes and the check nodes.
25. The method of claim 24, wherein the interrupting includes modifying at least one element of at least one vector associated with the decoding and then resuming the decoding iterations.
26. The method of claim 24, wherein the vector, the estimate of mutual information between which and the codeword is included in the interruption criterion, is a vector of N LLR estimates Q of the codeword bits and wherein the estimate of mutual information is $\frac{1}{E}\sum\left\lbrack 1 - \log_{2}\left( 1 + e^{- |Q|} \right) \right\rbrack$, where E is a number of edges in the graph.
 27. The method of claim 23,wherein the updating includes, in a parity check matrix that includesN−K rows and N columns, exchanging messages between the rows and thecolumns.
 28. The method of claim 27, wherein the interrupting includesmodifying at least one element of at least one vector associated withthe decoding and then resuming the decoding iterations.
 29. The methodof claim 23, wherein the interrupting is terminating the decodingiterations.
 30. A decoder for decoding a representation of a codewordthat encodes K information bits as N>K codeword bits, comprising aprocessor for executing an algorithm for decoding the representation ofthe codeword by steps including: in a plurality of decoding iterations,updating estimates of the codeword bits; and interrupting the decodingiterations if an interruption criterion, that includes an estimate ofmutual information between the codeword and a vector that is used in thedecoding iterations, is satisfied.
 31. A memory controller comprising:an encoder for encoding K information bits as a codeword of N>K codewordbits; and a decoder including a processor for executing an algorithm fordecoding a representation of the codeword by steps including: in aplurality of decoding iterations, updating estimates of the codewordbits, and interrupting the decoding iterations if an interruptioncriterion, that includes an estimate of mutual information between thecodeword and a vector that is used in the decoding iterations, issatisfied.
 32. The memory controller of claim 31, further comprising:circuitry for storing at least a portion of the codeword in a memory andfor retrieving a representation of the at least portion of the codewordfrom the memory.
 33. A memory device comprising: the memory controllerof claim 32; and the memory.
 34. A receiver comprising: a demodulator for demodulating a message received from a communication channel, thereby producing a representation of a codeword that encodes K information bits as N>K codeword bits; and a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: in a plurality of decoding iterations, updating estimates of the codeword bits, and interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.
 35. A communication system for transmitting and receiving a message, comprising: a transmitter including: an encoder for encoding K information bits of the message as a codeword of N>K codeword bits, and a modulator for transmitting the codeword via a communication channel as a modulated signal; and a receiver including: a demodulator for receiving the modulated signal from the communication channel and for demodulating the modulated signal, thereby producing a representation of the codeword, and a decoder including a processor for executing an algorithm for decoding the representation of the codeword by steps including: in a plurality of decoding iterations, updating estimates of the codeword bits, and interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied.
 36. A computer readable storage medium having computer readable code embodied on the computer readable storage medium, the computer readable code for decoding a representation of a codeword that encodes K information bits as N>K codeword bits, the computer readable code comprising: program code for, in a plurality of decoding iterations, updating estimates of the codeword bits; and program code for interrupting the decoding iterations if an interruption criterion, that includes an estimate of mutual information between the codeword and a vector that is used in the decoding iterations, is satisfied. 37-56. (canceled)
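By way of illustration only, the mutual-information estimate recited in claim 26 and its use in an interruption criterion might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the `min_gain` parameter, and the particular rule of interrupting when the estimate stops growing are hypothetical, and the sum is assumed to run over whatever LLR values are supplied in `q`, normalized by the number of edges E.

```python
import math


def mutual_information_estimate(q, num_edges):
    """Return (1/E) * sum(1 - log2(1 + exp(-|Q|))) over the LLR values in q.

    q         -- iterable of LLR estimates (assumed input of this sketch)
    num_edges -- E, the number of edges in the graph
    """
    total = 0.0
    for llr in q:
        total += 1.0 - math.log2(1.0 + math.exp(-abs(llr)))
    return total / num_edges


def should_interrupt(history, min_gain):
    """Hypothetical criterion: interrupt when the latest estimate has grown
    by less than min_gain relative to the previous iteration."""
    if len(history) < 2:
        return False
    return history[-1] - history[-2] < min_gain
```

In such a sketch, a decoder loop would append the estimate to `history` after each iteration and, once `should_interrupt` returns true, either terminate or modify elements of a vector used in the decoding and resume, as in claims 28 and 29.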